TRANSACTIONS 


OF THE 


AMERICAN MATHEMATICAL SOCIETY 


EDITED BY 


DUNHAM JACKSON 
EDWARD KASNER 


HOWARD HAWKS MITCHELL 


WITH THE COOPERATION OF


RAYMOND W. BRINK EDWARD W. CHITTENDEN WILLIAM C. GRAUSTEIN
OLIVE C. HAZLETT EINAR HILLE AUBREY J. KEMPNER
JOHN R. KLINE CHARLES N. MOORE GEORGE Y. RAINICH
JOSEPH F. RITT CAROLINE E. SEELY FRANCIS R. SHARPE
ELLIS B. STOUFFER JACOB D. TAMARKIN J. H. M. WEDDERBURN


VOLUME 30


1928 


PUBLISHED BY THE SOCIETY 
MENASHA, WIS., AND NEW YORK 


1928 


COMPOSED, PRINTED AND BOUND BY 
The Collegiate Press
GEORGE BANTA PUBLISHING COMPANY 
MENASHA, WISCONSIN 




TABLE OF CONTENTS 


VOLUME 30, 1928 
Adams, C. R., of Providence, R. I. On the irregular cases of the linear ordinary difference equation.

Alexander, J. W., of Princeton, N. J. Topological invariants of knots and links.

Archibald, R. G., of New York, N. Y. Diophantine equations in division algebras.

Ayres, W. L., of Philadelphia, Pa. Concerning the arc-curves and basic sets of a continuous curve.

Copeland, A. H., of Cambridge, Mass. Types of motion of the gyroscope.

Davis, D. R., of Eugene, Ore. The inverse problem of the calculus of variations in higher space.

Dickson, L. E., of Chicago, Ill. Simpler proofs of Waring's theorem on cubes, with various generalizations.

Dines, L. L., of Saskatoon, Canada. A theorem on orthogonal functions with an application to integral inequalities.

    A theorem on orthogonal sequences.

Gehman, H. M., of Austin, Tex. Concerning end points of continuous curves and other continua.

Grove, V. G., of East Lansing, Mich. Transformations of nets.

Huntington, E. V., of Cambridge, Mass. The apportionment of representatives in Congress.

Hurwitz, W. A., of Ithaca, N. Y. On Bell's arithmetic of Boolean algebra.

Jackson, D., of Minneapolis, Minn. Some non-linear problems in approximation.

Kasner, E., of New York, N. Y. in space of three

    The second derivative of a polygenic function.

Ketchum, P. W., of Urbana, Ill. Analytic functions of hypercomplex variables.

Lane, E. P., of Chicago, Ill. The projective differential geometry of systems of linear homogeneous differential equations of the first order.

Langer, R. E., of Madison, Wis., and Tamarkin, J. D., of Providence, R. I. On integral equations with discontinuous kernels.

Lubben, R. G., of Austin, Tex. Concerning limiting sets in abstract spaces.

Matthews, R. M., of Morgantown, W. Va. Cubic curves and desmic surfaces; second paper.

Mears, F. M., of Ithaca, N. Y. Riesz summability for double series.

Miller, G. A., of Urbana, Ill. Possible orders of two generators of the alternating and of the symmetric group.

Milne, W. E., of Eugene, Ore. The behavior of a boundary value problem as the interval becomes infinite.






Morse, M., of Cambridge, Mass. The foundations of a theory in the calculus of variations in the large.

O., of New Haven, Conn. Some on connection ideals and group of a Galois field.

Pierpont, J., of New Haven, Conn. Optics in hyperbolic space.

Rawles, T. H., of New Haven, Conn. The invariant integral and the inverse problem in the calculus of variations.

Richardson, R. G. D., of Providence, R. I. A problem in the calculus of variations with an infinite number of auxiliary conditions.

Richmond, D. E., of Ithaca, N. Y. Geodesics on surfaces of genus zero.

Roos, C. F., of Houston, Tex. Generalized Lagrange problems in the calculus of variations.

Roth, W. E., of Madison, Wis. A solution of the matric equation P(X) = A.

Slotnick, M. M., of Cambridge, Mass. A contribution to the theory of fundamental transformations of surfaces.

Smith, H. L., of Baton Rouge, La. On relative content and Green's lemma.

Sturdivant, J. H., of Austin, Tex. Second-order linear systems with summable coefficients.

Tamarkin, J. D., of Providence, R. I. See Langer, R. E.

Uspensky, J. V., of Leningrad, Russia. On Jacobi's arithmetical theorems concerning the simultaneous representation of numbers by two different quadratic forms.

    On the convergence of quadrature formulas related to an infinite interval.

Walsh, J. L., of Cambridge, Mass. On the expansion of analytic functions in series of polynomials and in series of other analytic functions.

    On approximation to an arbitrary function of a complex variable by polynomials.

    On the degree of approximation to an analytic function by means of rational functions.

Weiss, M. J., of Stanford University, Calif. Primitive groups which contain substitutions of prime order p and of degree 6p or 7p.

Whyburn, G. T., of Austin, Tex. Concerning the cut points of continua.

Whyburn, W. M., of Austin, Tex. Second-order differential systems with integral and k-point boundary conditions.

    Existence and oscillation theorems for non-linear differential systems of the second order.

Widder, D. V., of Bryn Mawr, Pa. A generalization of Taylor's series.

Williamson, J., of Chicago, Ill. Conditions for associativity of division algebras connected with non-abelian groups.

Zarycki, M., of Lemberg, Poland. dus Chater

Errata, volumes 24, 30.




SIMPLER PROOFS OF WARING’S THEOREM ON CUBES, 
WITH VARIOUS GENERALIZATIONS* 


BY 
L. E. DICKSON 


1. Introduction. In 1770 Waring conjectured that every positive integer is a sum of nine integral cubes $\geq 0$. The first proof was given by Wieferich;† but owing to a numerical error he failed to treat a wide range of numbers corresponding to $\nu = 4$. Bachmann‡ indicated a long method to fill the gap, but himself made certain errors. The latter were incorporated in the unsuccessful attempt by Lejneek.§ The gap was first filled by Kempner.‖

All of these writers make use of three tables. The computation of each 
of the last two tables is considerably longer than the first. The third table 
as given by Wieferich and reproduced by Bachmann contains six errors, 
corrected by Kamke (cf. Kempner, Mathematische Annalen, loc. cit., p. 399). 
It is shown here that the last two tables may be completely avoided. The 
resulting simple proof of Waring’s theorem in §§2, 3 is based on the customary 
prime 5. The second simple proof in §4 is based on the prime 11. By §5, we 
may also use the prime 17. 

However, the main object of the paper is to prove generalizations of two types. Let $C_n$ denote the sum of the cubes of $n$ undetermined integers $\geq 0$. Waring's theorem states that $C_9$ represents all positive integers. It is proved in §§ 4, 5 that $tx^3 + C_8$ represents all positive integers if $1 \leq t \leq 23$, $t \neq 20$, but not if $t > 23$. To complete the discussion for $t = 20$ would require the extension of von Sterneck's table from 40,000 to 61,500.

It is proved in §6 that $tx^3 + 2y^3 + C_7$ represents all positive integers if $1 \leq t \leq 34$, $t \neq 10, 15, 20, 25, 30$. Also that $tx^3 + 3y^3 + C_7$ represents all if $1 \leq t \leq 9$, $t \neq 5$. Various similar theorems are highly probable in view of Lemma 8. More interesting empirical theorems on cubes were announced by the writer in the American Mathematical Monthly for April, 1927, and on biquadrates in the Bulletin of the American Mathematical Society, May-June, 1927.


* Presented to the Society, April 15, 1927; received by the editors February 16, 1927. 

† Mathematische Annalen, vol. 66 (1909), pp. 99-101.

‡ Niedere Zahlentheorie, vol. 2, 1910, pp. 477-8.

§ Mathematische Annalen, vol. 70 (1911), pp. 454-6.

‖ Über das Waringsche Problem und einige Verallgemeinerungen, Dissertation, Göttingen, 1912. Extract in Mathematische Annalen, vol. 72 (1912), pp. 387-399.



L. E. DICKSON [January 


If $N$ is prime to 6, it is shown in §7 that every integer $k$ is represented by $6x^2 + 6y^2 + 6z^2 + Nw^3$, and that we may take $w \geq 0$ if $k \geq 23^3N$. In §8 is discussed the representation of all large integers by $ly^3 + C_7$ when $l \leq 5$.

The tables and computations in §§ 2-4, 6 and the first part of §5 were 
kindly checked with great care by Lincoln La Paz. 

2. Three lemmas needed for Waring’s theorem. We prove the follow- 
ing lemmas. 


Lemma 1. If $p$ is a prime $\equiv 2 \pmod 3$ and if $l$ is an integer not divisible by $p$, every integer not divisible by $p$ is congruent modulo $p^n$ to a product of a cube by $l$.

From the positive integers $\leq p^n$ we omit the $p^{n-1}$ multiples of $p$ and obtain $\phi = (p-1)p^{n-1}$ numbers $a_1, \dots, a_\phi$. Each $la_i^3$ is not divisible by $p$ and hence is congruent to one of the $a$'s modulo $p^n$. We shall prove that no two of the $la_i^3$ are congruent. It will then follow that $la_1^3, \dots, la_\phi^3$ are congruent to $a_1, \dots, a_\phi$ in some order. Since every integer not divisible by $p$ is congruent to a certain $a_i$, it will therefore be congruent to a certain $la_j^3$.

If possible, let $la_i^3 \equiv la_j^3 \pmod{p^n}$ with $i \neq j$. Since $a_i \equiv a_jx \pmod{p^n}$ determines an integer $x$, we have $x^3 \equiv 1$. By Euler's theorem, $x^\phi \equiv 1 \pmod{p^n}$. Since $\phi$ is not divisible by 3, $\phi = 3q + r$, $r = 1$ or 2. Hence $x^r \equiv 1$, $x \equiv 1$, $a_i \equiv a_j \pmod{p^n}$, contrary to hypothesis.
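Lemma 1 can be spot-checked numerically. The following is a minimal sketch, not part of the original argument; the helper name `cube_images` and the sample triples $(p, n, l)$ are illustrative choices. It lists the units modulo $p^n$ and compares them with the residues $la^3$.

```python
def cube_images(p, n, l):
    """Return (units, images): the units a modulo p**n and the sorted residues l*a**3."""
    m = p ** n
    units = [a for a in range(1, m) if a % p != 0]
    images = sorted((l * a ** 3) % m for a in units)
    return units, images

# p = 5 and p = 11 are congruent to 2 modulo 3, so a -> l*a^3 should permute the units.
for p, n, l in [(5, 2, 1), (5, 3, 2), (11, 2, 3)]:
    units, images = cube_images(p, n, l)
    print(p, n, l, images == units)

# p = 7 is congruent to 1 modulo 3; here distinct units can share a cube.
units, images = cube_images(7, 1, 1)
print(7, 1, 1, images == units)
```

The contrast with $p = 7$ shows where the hypothesis $p \equiv 2 \pmod 3$ is used.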


Lemma 2. Let $P$ and $e$ be given integers $\geq 0$, such that $P$ is of the form $5 + 48l$. Then every integer $\geq P^e \cdot 22^3$ can be represented by $P^e\gamma^3 + 6(x^2 + y^2 + z^2)$, where $\gamma, x, y, z$ are integers and $\gamma \geq 0$.

It is known that every positive integer not of the form $4^k(8s+7)$ is a sum of three integral squares. Hence this is true of positive integers congruent modulo 16 to one of the following:

(1) $1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 13, 14.$

If $n$ is any integer, we shall prove that

(2) $n \equiv P^e\gamma^3 + 6\mu \pmod{96}$

has integral solutions $\gamma$, $\mu$ such that $0 \leq \gamma \leq 22$, and such that $\mu$ is one of the numbers (1). Then there is an integer $q$ for which

$n = P^e\gamma^3 + 6\mu + 96q = P^e\gamma^3 + 6m, \quad m = \mu + 16q.$

When $n \geq P^e \cdot 22^3$, then $n \geq P^e\gamma^3$, $m \geq 0$, whence $m$, being congruent to $\mu$ modulo 16, is a sum of three integral squares. Thus Lemma 2 will follow if we show that (2) has solutions of the specified type.

We shall first treat the case $e = 0$:

(3) $n \equiv \gamma^3 + 6\mu \pmod{96}.$


The method for (3) is such that, by multiplying it by $P, P^2, \dots$, we can deduce at once the solvability of (2). With this end in view, we omit 3 and 11 from (1) and obtain the numbers

(4) $1, 2, 4, 5, 6, 8, 9, 10, 13, 14,$

whose products by 5 (and hence by $P$) are congruent modulo 16 to the same numbers (4) rearranged.

At the top of the following table we list certain values of $\gamma$ and below them the least residues modulo 96 of their cubes. The body of the table shows the residue modulo 96 of $\gamma^3 + 6\mu$ for certain values (4) of $\mu$.


$\gamma$     0  1  2  3  4  5  6  7  8  9 10 11 13 14 15 17 18 22
$\gamma^3$   0  1  8 27 64 29 24 55 32 57 40 83 85 56 15 17 72 88

             6  7 14 33 70 35
            12 13 20 39 76 41    67    69    95  1    27 29
            24 25 32 51 88 53             64                 0
            30 31 38 57 94 59
            36 37 44 63  4 65    91    93    23
            48 49 56 75 16 77 72    80              8          40
            54 55 62 81 22 83
            60 61 68 87 28 89    19    21    47
            78 79 86  9 46 11
            84 85 92 15 52 17    43    45    71 73     3  5

In the body of the table occur $0, 1, \dots, 95$ with the exception* of

(5) $2, 10, 18, 26, 34, 42, 50, 58, 66, 74, 82, 90.$

The latter give all the positive integers $< 96$ of the form $2 + 8r$.

But 3 and 11 are also available values of $\mu$. For

(6) $\gamma = 0, 2, 4, 6, 8, 10,$

the residues modulo 96 of $\gamma^3 + 6 \cdot 3$ and $\gamma^3 + 6 \cdot 11$ are together found to be the numbers (5). This can be proved without computation as follows.

In (6), $\gamma = 2g$, $g = 0, 1, 2, 3, 4, 5$. Thus $\gamma^3 + 18 = 2 + 8(g^3 + 2)$, $\gamma^3 + 66 = 2 + 8(g^3 + 8)$. Hence it remains only to show that the values of $g^3 + 2$ and $g^3 + 8$ are together congruent to $0, 1, \dots, 11$ modulo 12. But $g^3 \equiv g \pmod 6$. Hence $g^3 + 2$ takes six values incongruent modulo 6 and therefore also modulo 12. Likewise for $g^3 + 8$. But $g^3 + 2 \equiv G^3 + 8 \pmod{12}$ would imply $g \equiv G \pmod 6$, $g = G$, a contradiction.

* However large a $\gamma$ we take, we cannot reach an exceptional number (5). For $\gamma^3 + 6\mu \equiv 2 + 8r \pmod{96}$ implies that $\gamma$ is even and hence $6\mu \equiv 2 \pmod 8$, $\mu \equiv 3 \pmod 4$, $\mu \equiv 3, 7, 11, 15 \pmod{16}$. But none of these four occur in (4).
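The residue claims above are finite and can be recomputed directly. The sketch below is an illustration only; the names `GAMMAS`, `MU4` and `residues` are hypothetical, and the list of $\gamma$ values is the one read off from the head of the table. It checks that the table omits exactly the numbers $2 + 8r$ and that $\mu = 3, 11$ with even $\gamma \leq 10$ supply them.

```python
GAMMAS = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 17, 18, 22]
MU4 = [1, 2, 4, 5, 6, 8, 9, 10, 13, 14]          # the list (4)

def residues(gammas, mus, modulus=96):
    """Set of residues of gamma^3 + 6*mu for the given gamma and mu."""
    return {(g ** 3 + 6 * m) % modulus for g in gammas for m in mus}

table = residues(GAMMAS, MU4)
exceptions = sorted(set(range(96)) - table)       # should be the list (5)
extra = residues([0, 2, 4, 6, 8, 10], [3, 11])    # gamma in (6), mu = 3 or 11

print(exceptions)
print(sorted(extra) == exceptions)
```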

Hence for every integer $n$, (3) has integral solutions $\gamma$, $\mu$, $0 \leq \gamma \leq 22$, $\mu$ in (1).

In $5(2 + 8r) = 2 + 8\rho$, $\rho = 1 + 5r$ ranges with $r$ over a complete set of residues modulo 12. In other words, the products of the numbers (5) by 5 are congruent modulo 96 to the same numbers (5) rearranged. The same is true of their products by $P = 5 + 48l$, since $2k \cdot P \equiv 2k \cdot 5 \pmod{96}$. Evidently the products of $0, 1, \dots, 95$ by $P$ are congruent modulo 96 to $0, 1, \dots, 95$ rearranged. Hence the products of the numbers in the above table by $P$ are congruent to the same numbers modulo 96. Those numbers are therefore the residues modulo 96 of $P(\gamma^3 + 6\mu)$ for $0 \leq \gamma \leq 22$ and for $\mu$ in (4). We saw that the products $P\mu$ are congruent modulo 16 to the same numbers (4) rearranged. Hence the residues modulo 96 of $P\gamma^3 + 6\nu$ for $0 \leq \gamma \leq 22$ and for $\nu$ in (4) are the numbers in the table and hence are the numbers $0, 1, \dots, 95$ other than (5).

To complete the proof of the statement concerning (2) when $e = 1$, it remains to show that, by choice of $\gamma$ in (6) and for $t = 18$ or 66, $P\gamma^3 + t$ is congruent modulo 96 to any assigned number in (5). Since the last was proved for $\gamma^3 + t$, we need only show that $\gamma^3$ and $P\gamma^3$ take the same values modulo 96 when $\gamma$ takes the values (6). Then $\gamma = 2g$, $g = 0, 1, 2, 3, 4, 5$. Thus $g^3 \equiv 0, 1, 8, 3, 4, 5$; $5g^3 \equiv 0, 5, 4, 3, 8, 1 \pmod{12}$, respectively. Hence $\gamma^3$ and $5\gamma^3$ take the same values modulo $8 \cdot 12$. But the products of 5 and $P = 5 + 48l$ by the same even number $\gamma^3$ are congruent modulo 96.

The insertion of the factor $P$ may be repeated $e$ times. This proves the statement concerning (2).
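The congruence (2) is again a finite statement for each residue of $P^e$ modulo 96, so it can be checked by enumeration. This is a sketch under the assumption that all $\gamma$ from 0 to 22 may be used; the function name `congruence_2_solvable` is illustrative.

```python
MU1 = [1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 13, 14]   # the list (1)

def congruence_2_solvable(P, e, gamma_max=22):
    """True if every n mod 96 equals P^e * gamma^3 + 6*mu with 0 <= gamma <= gamma_max, mu in (1)."""
    coef = pow(P, e, 96)
    hit = {(coef * g ** 3 + 6 * m) % 96
           for g in range(gamma_max + 1) for m in MU1}
    return hit == set(range(96))

# P = 5, 53, 101 are of the form 5 + 48*l.
for P in (5, 53, 101):
    print(P, [congruence_2_solvable(P, e) for e in range(5)])
```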


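Lemma 3. Given the positive numbers $s$ and $t$ and a number $B$ for which $0 \leq B \leq s$, $t \leq 9^3s$, we can find an integer $i \geq 0$ such that

(7) $B \leq s - ti^3 < B + 3(ts^2)^{1/3}.$

Denote the last member of (7) by $L$. If $s < L$, take $i = 0$. Next, let $s \geq L$ and determine a real number $r$ so that $s - tr^3 = B$. Then

$tr^3 = s - B \geq 3(ts^2)^{1/3} \geq t/27, \quad 3r \geq 1.$

We may write $r = i + f$, where $0 \leq f < 1$, and $i$ is an integer $\geq 0$. Since $i \leq r$, $B = s - tr^3 \leq s - ti^3$, as desired in (7). Next,

$s - ti^3 - B = t(r^3 - i^3) = tw,$

where

$w = r^3 - (r - f)^3 = 3r^2f - f^2(3r - f) \leq 3r^2f < 3r^2,$

since $3r \geq 1$, $f < 1$. Since $B \geq 0$,

$s - ti^3 - B < 3tr^2 \leq 3(ts^2)^{1/3}.$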
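The choice of $i$ in Lemma 3 is the integer part of $r$, that is, the largest $i$ with $s - ti^3 \geq B$. A small sketch of that choice follows; the function names are hypothetical and the cube root is taken in floating point, so it is a numerical illustration rather than a proof.

```python
def lemma3_index(s, t, B):
    """Largest integer i >= 0 with s - t*i**3 >= B (the integer part of r in the proof)."""
    i = 0
    while s - t * (i + 1) ** 3 >= B:
        i += 1
    return i

def satisfies_7(s, t, B):
    """Check inequality (7) for the index chosen above."""
    i = lemma3_index(s, t, B)
    rest = s - t * i ** 3
    return B <= rest < B + 3 * (t * s * s) ** (1.0 / 3.0)

print(lemma3_index(10 ** 6, 1, 10 ** 4), satisfies_7(10 ** 6, 1, 10 ** 4))
```

For $s = 10^6$, $t = 1$, $B = 10^4$ the index is 99, leaving $s - i^3 = 29{,}701$, which lies between $10^4$ and $4 \cdot 10^4$.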


3. Proof of Waring's theorem. We first prove that every integer $s$ exceeding $9 \cdot 5^{12}$ is a sum of nine integral cubes $\geq 0$. For this proof we take $C = 9$, $p = 5$, $t = 1$ in our formulas. Since $s > Cp^{12}$ there exists an integer $n \geq 4$ such that

(8) $Cp^{3n} < s \leq Cp^{3n+3}.$

Write

(9) $k = 3(tC^2)^{1/3}p^{2n+2}.$

Hence

(10) $3(ts^2)^{1/3} \leq k.$

We separate two cases. First, let $Cp^{3n} + 2k \leq s$. Then $Cp^{3n}$ and $Cp^{3n} + k$ are both $< s$. Taking them in turn for $B$ in Lemma 3, and using (10), we conclude that there exist integers $I$ and $J$, each $\geq 0$, such that

$Cp^{3n} \leq s - tI^3 < Cp^{3n} + k,$
$Cp^{3n} + k \leq s - tJ^3 < Cp^{3n} + 2k.$

Hence there are two distinct integral values $I$ and $J$ of $i$ which satisfy

(11) $Cp^{3n} \leq s - ti^3 < Cp^{3n} + 2k, \quad i \geq 0.$

Second, let $Cp^{3n} + 2k > s$. Then (11) holds for $i = 0$ and (when $t = 1$) for $i = 1$, since the integer $Cp^{3n}$ is less than $s$ and hence is $\leq s - 1$.

Hence in both cases there exist two distinct integers and hence two consecutive integers $j - 1$ and $j$, which are both values of $i$ satisfying (11).

At least one of the integers $s - t(j-1)^3$ and $s - tj^3$ is not divisible by 5. For, their difference is the product of $t$ by $3j^2 - 3j + 1$. The double of the latter is congruent to $(j+2)^2 - 2$, modulo 5. But 2 is not congruent to a square.
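The non-divisibility step only needs $3j^2 - 3j + 1$ to have no root modulo the prime in use. A minimal sketch follows, with the hypothetical helper `gap_roots`; the primes 11 and 17 are the ones the paper uses in §§4, 5, and 7 is included only for contrast.

```python
def gap_roots(p):
    """Residues j modulo p with 3*j^2 - 3*j + 1 divisible by p."""
    return [j for j in range(p) if (3 * j * j - 3 * j + 1) % p == 0]

for p in (5, 11, 17, 7):
    print(p, gap_roots(p))
```

For $p = 5, 11, 17$ the list is empty, so two consecutive values $j - 1$, $j$ cannot both give a multiple of $p$ when $t$ is prime to $p$; for $p = 7$ roots do exist.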

Hence there exists an integer $a \geq 0$ such that (11) holds when $i = a$, and such that $s - ta^3$ is not divisible by $p = 5$. By Lemma 1, there exist integers $b$ and $M$ such that

(12) $s - ta^3 = b^3 + p^nM, \quad 0 < b < p^n.$

When $n \geq 4$, we have

(13) $Cp^{3n} + 2k \leq 12p^{3n}.$

For, if we insert the value (9) of $k$, divide all terms by $p^{3n}$, and note that $1/p^{n-2} \leq 1/p^2$, we see that (13) holds if

(14) $C + \frac{6}{p^2}(tC^2)^{1/3} \leq 12.$

When $C = 9$, $p = 5$, this holds if

$t \leq \frac{(12.5)^3}{9^2} = 24.1.$

By (11) with $i = a$, (12) and (13), we get

$Cp^{3n} \leq b^3 + p^nM < 12p^{3n}, \quad (C-1)p^{3n} < Cp^{3n} - b^3 \leq p^nM.$

Hence

$(C-1)p^{2n} < M < 12p^{2n}.$

Write $M = N + 6p^{2n}$. Thus

(15) $(C-7)p^{2n} < N < 6p^{2n},$

(16) $s = ta^3 + b^3 + p^n(N + 6p^{2n}).$

We seek integers $c$ and $m$, each $\geq 0$, such that

(17) $p^nN = c^3 + p^n \cdot 6m, \quad m = d_1^2 + d_2^2 + d_3^2,$

for integers $d_i$. Then will

(18) $s = ta^3 + b^3 + c^3 + p^n(6p^{2n} + 6m).$

Writing $A$ for $p^n$, we then have

(19) $p^n(6p^{2n} + 6m) = \sum_{i=1}^{3} \left[(A + d_i)^3 + (A - d_i)^3\right].$
These cubes are all $\geq 0$. For, if $d_i^2 > A^2$, then $m > A^2 = p^{2n}$, and, by (17), $p^nN > 6p^np^{2n}$, contrary to (15). Hence $s$ is a sum of nine integral cubes $\geq 0$.

It remains to select $c$ and $m$. Choose an integer $e$ so that

(20) $e = 0, 1, 2, \quad e + n \equiv 0 \pmod 3.$

The condition in Lemma 2 is $N \geq 5^e \cdot 22^3$. By (15), this will be satisfied if $(C-7)p^{2n} \geq 5^e \cdot 22^3$. When $n \geq 4$, the minimum value of $2n - e$ is 6. Hence it suffices to take

(21) $(C-7)5^6 \geq 22^3, \quad C - 7 \geq \left(\frac{22}{25}\right)^3 = (0.88)^3 = 0.681472.$

Thus if $C \geq 7.682$, Lemma 2 shows the existence of integers $\gamma$ and $m$, each $\geq 0$, such that $N = 5^e\gamma^3 + 6m$, where $m$ is a sum of three integral squares. By (20), $5^{e+n}\gamma^3$ is the cube of an integer $c \geq 0$. Thus (17) holds when $p = 5$.

This completes the proof that every integer $s$ exceeding $9 \cdot 5^{12}$ is a sum of nine integral cubes $\geq 0$. The same is true when $s \leq 40{,}000$ by the table of von Sterneck,* which shows also that if $8042 < s \leq 40{,}000$, $s$ is a sum of six integral cubes $\geq 0$. To utilize the latter result, let $10^4 < s \leq 9 \cdot 5^{12}$. By Lemma 3 with $B = 10^4$, there exists an integer $u \geq 0$ satisfying

(22) $10^4 \leq \sigma < 10^4 + 3(ts^2)^{1/3}, \quad \sigma = s - tu^3.$

We have $s < 5^{14}$. For $t < 5^2$, the radical is $< 5^{10}$. Also, $10^4 = 5^4 2^4 < 5^6$. Hence $\sigma < 16 \cdot 5^9 < 4^3 \cdot 5^9$.

Apply Lemma 3 with $t = 1$, $B = 10^4$, and $s$ replaced by $\sigma$. Thus there exists an integer $v \geq 0$ satisfying

(23) $10^4 \leq \tau < 10^4 + 3\sigma^{2/3}, \quad \tau = \sigma - v^3.$

The radical is $< 4^2 \cdot 5^6$. Also, $10^4 < 4^2 \cdot 5^6$. Hence $\tau < 4^3 \cdot 5^6$. As before, there exists an integer $w \geq 0$ satisfying

(24) $10^4 \leq \tau - w^3 < 10^4 + 3\tau^{2/3} \leq 4 \cdot 10^4 = 40{,}000.$

Since $\tau - w^3$ is therefore a sum of six cubes, while $s = tu^3 + v^3 + \tau$, $s$ is a sum of nine integral cubes $\geq 0$. This completes the proof of Waring's theorem.

4. The first generalizations. Let $C_n$ denote the sum of the cubes of $n$ undetermined integers $\geq 0$. Let $t$ be an integer $\geq 0$.

Lemma 4. The form $f_t = tx^3 + C_8$ represents all positive integers $\leq 40{,}000$ if and only if $0 < t \leq 23$.

If $t > 23$ or if $t = 0$, $C_8$ and hence $f_t$ fail to represent 23. Next, let $0 < t \leq 23$. By von Sterneck's table, every positive integer $\leq 40{,}000$, except 23 and 239, is a sum of eight integral cubes $\geq 0$. It remains only to show that $f_t$ represents 23 and 239. Take $x = 1$. Since

$0 \leq 23 - t < 23, \quad 23 < 239 - t < 239,$

both $23 - t$ and $239 - t$ are represented by $C_8$.
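The tabular fact used in Lemma 4 can be recomputed with a short dynamic program. This is a sketch, not von Sterneck's method; the names `min_cubes`, `need_nine` and `f_represents` are illustrative, and $t \geq 1$ is assumed.

```python
def min_cubes(limit):
    """best[n] = least number of positive cubes summing to n, for 0 <= n <= limit."""
    cubes = []
    c = 1
    while c ** 3 <= limit:
        cubes.append(c ** 3)
        c += 1
    best = [0] * (limit + 1)
    for n in range(1, limit + 1):
        best[n] = 1 + min(best[n - q] for q in cubes if q <= n)
    return best

LIMIT = 40_000
best = min_cubes(LIMIT)
# Integers up to LIMIT that are not sums of eight cubes >= 0.
need_nine = [n for n in range(1, LIMIT + 1) if best[n] > 8]

def f_represents(t, n):
    """True if n = t*x^3 + (at most eight positive cubes) for some x >= 0; assumes t >= 1."""
    x = 0
    while t * x ** 3 <= n:
        if best[n - t * x ** 3] <= 8:
            return True
        x += 1
    return False

print(need_nine)
print(all(f_represents(t, n) for t in range(1, 24) for n in need_nine))
```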


* Akademie der Wissenschaften, Wien, Sitzungsberichte, vol. 112, IIa (1903), pp. 1627-1666. Dahse's table to 12,000 was published by Jacobi, Journal für Mathematik, vol. 42 (1851), p. 41; Werke, vol. 6, p. 323.


Theorem I. If $1 \leq t \leq 23$, $t \neq 20$, every positive integer is represented by $f_t = tx^3 + C_8$.

We proceed as in §3 with $p = 5$ or $p = 11$ according as $t$ is not divisible by 5 or 11, and with $n \geq 4$ or $n \geq 3$, respectively. We shall find limits within which $C$ may be chosen. But we refrain from making a definite choice for $C$ initially, since we may need to decrease $C$ slightly to meet the difficulty arising below (11) when $t > 1$. Then (11) does not hold for $i = 1$ if

(25) $Cp^{3n} > s - t.$

In the latter case, we employ a new constant $C'$. Then

$C'p^{3n+3} = Cp^{3n} \cdot p^3C'/C > (s - t)p^3C'/C$

will be $\geq s$ if

$\frac{C'}{C} \geq \frac{s}{p^3(s - t)}$

and hence if $C' > \tfrac{1}{2}C$. Thus if $C'$ lies between $\tfrac{1}{2}C$ and $C$, (8) will remain true after $C$ is replaced by $C'$. By (25), $Cp^{3n} = s - t + P$, $P > 0$. By (8), $P < t \leq 23$. Write $g = P/p^{3n}$. Since $n \geq 4$ or $\geq 3$, according as $p = 5$ or 11, $g$ is very small. We take $C' = C - g$. Then $C'$ lies between $\tfrac{1}{2}C$ and $C$, and $C'p^{3n} = s - t$. Hence after taking $C'$ as a new $C$, we have (8) and the desired two integral solutions $i$ of (11) in all cases.

For $p = 5$, $t \leq 23$, (14) holds when $C \leq 9.03$. Reduction to $C = 9$ permits us to avoid the difficulty mentioned before. The entire proof in §3 now holds if $p = 5$ and if $t$ is not divisible by 5.


Lemma 5. Let $P = 11 + 48l$ and $e$ be given integers $\geq 0$. Every integer $\geq P^e \cdot 23^3$ is represented by $P^e\gamma^3 + 6(x^2 + y^2 + z^2)$, $\gamma \geq 0$.

We now omit 4, 5, and 13 from the available numbers (1) and have

(26) $1, 2, 3, 6, 8, 9, 10, 11, 14,$

whose products by $P$ are congruent modulo 16 to the same numbers (26) rearranged.
The following table shows the residues modulo 96 of $\gamma^3 + 6\mu$ for $\gamma = 0, 1, \dots, 23$ and for $\mu$ in (26). It was computed as in §2, with also $19^3 \equiv 43$, $21^3 \equiv 45$, $23^3 \equiv 71 \pmod{96}$.

$\gamma$     0  1  2  3  4  5  6  7  8  9 10 11

             6  7 14 33 70 35 30    38    46
            12 13 20 39 76 41          69
            18 19 26 45 82 47 42 73 50    58  5
            36 37 44 63  4 65    91          23
            48 49 56 75 16 77 72    80  9 88
            54 55 62 81 22 83 78    86    94
            60 61 68 87 28 89          21
            66 67 74 93 34 95 90 25  2 27 10 53
            84 85 92 15 52 17    43          71

$\gamma$    13 14 15 17 18 19 21 22 23

$6\mu=12$    1       29       57
$6\mu=36$         51       79       11
$6\mu=48$      8       24       40
$6\mu=84$          3       31       59

In the body of the table occur $0, 1, \dots, 95$ with the exception of

(27) $0, 32, 64.$

Since $32P \equiv 64$, $64P \equiv 32 \pmod{96}$, and since the products of $0, 1, \dots, 95$ by $P$ are evidently congruent modulo 96 to the same numbers rearranged, the same is true of the numbers in the table. Using the omitted value 4 of $\mu$, we get

(28) $18^3 + 24 \equiv 0, \quad 2^3 + 24 \equiv 32, \quad 10^3 + 24 \equiv 64 \pmod{96}.$

Since $4(P^2 - 1) \equiv 4(11^2 - 1) \equiv 0 \pmod{96}$, $4P^{2k} \equiv 4$, and multiplication of (28) by $P^{2k}$ yields

(29) $18^3P^{2k} + 24 \equiv 0, \quad 2^3P^{2k} + 24 \equiv 32, \quad 10^3P^{2k} + 24 \equiv 64 \pmod{96}.$

This completes the proof of Lemma 5 when $e$ is even.

Since $P + 1$ is divisible by 12, the product of an even cube by $P$ is congruent to its negative, modulo 96. Hence

(30) $6^3P \equiv -24, \quad 22^3P \equiv 8, \quad 14^3P \equiv 40 \pmod{96}.$

As before, multiplication of (30) by $P^{2k}$ yields

(31) $6^3P^e + 24 \equiv 0, \quad 22^3P^e + 24 \equiv 32, \quad 14^3P^e + 24 \equiv 64 \pmod{96},$

where $e = 2k + 1$. Thus Lemma 5 follows when $e$ is odd.

Let $p = 11$, $t \leq 15$, $t \neq 11$. Then (13) holds if

(32) $C + \frac{6}{11}(tC^2)^{1/3} \leq 12.$
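The statements about the table for Lemma 5 are finite checks modulo 96. The sketch below recomputes the exceptional set (27) and the full congruence for a few $P = 11 + 48l$; the names `residues` and `lemma5_congruence_holds` are illustrative, and all $\gamma$ from 0 to 23 are allowed.

```python
MU1 = [1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 13, 14]   # the list (1)
MU26 = [1, 2, 3, 6, 8, 9, 10, 11, 14]            # the list (26)

def residues(coef, mus, gamma_max=23):
    """Residues modulo 96 of coef*gamma^3 + 6*mu for 0 <= gamma <= gamma_max."""
    return {(coef * g ** 3 + 6 * m) % 96
            for g in range(gamma_max + 1) for m in mus}

missing = sorted(set(range(96)) - residues(1, MU26))   # expected to be (27)

def lemma5_congruence_holds(P, e):
    """True if every n mod 96 equals P^e*gamma^3 + 6*mu with gamma <= 23 and mu in (1)."""
    return residues(pow(P, e, 96), MU1) == set(range(96))

print(missing)
print(all(lemma5_congruence_holds(P, e) for P in (11, 59, 107) for e in range(6)))
```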


When $C = 7.05$ and $t = 15$, the left member is 11.9960. By (15), the condition in Lemma 5 is satisfied if $(C-7)11^{2n} \geq 11^e \cdot 23^3$. By (20), the minimum of $2n - e$ for $n \geq 3$ is 6. Hence it suffices to take

(33) $(C-7)11^6 \geq 23^3, \quad C - 7 \geq 0.006868.$

Hence all the conditions on $C$ are satisfied if $C = 7.01$, and the reduction from 7.05 avoids the difficulty arising when (25) holds. Next,

$4(3j^2 - 3j + 1) \equiv (j+5)^2 + 1 \pmod{11},$

while $-1$ is not congruent to a square. For $C = 7.01$, the proof in §3 now shows that every integer $s$ exceeding $C \cdot 11^9$ is represented by $f_t$. It remains to prove this also when $10^4 < s \leq C \cdot 11^9$. Consider (22) and (23). Now

(34) $(15C^2)^{1/3} = 9.033, \quad 10^4 < (0.01)11^6, \quad \sigma < (27.11)11^6,$
$\sigma^{2/3} < (9.0245)11^4, \quad 10^4 < (0.69)11^4, \quad \tau < 28 \cdot 11^4 < 7^3 \cdot 11^3.$

In place of (24), we now have $10^4 \leq \tau - w^3 < 28{,}000$. This proves Theorem I when $t \leq 15$, $t \neq 11$.

5. The case $t = 20$. First, take $p = 11$. The proof fails if $n = 3$, since (32) requires $C < 7$. Hence $n \geq 4$, and (14) holds if $C \leq 11.3$. Thus every $s$ exceeding $C \cdot 11^{12}$ is represented by $f_{20}$, where $C^2 = 50$. But if $s \leq C \cdot 11^{12}$, we obtain by (22)-(24) the condition $\tau - w^3 < 152{,}794$, which is far beyond the limit of von Sterneck's table.

A better result is obtained by taking $p = 17$, $n \geq 3$. Lemma 2 holds also when $P = 17 + 48l$. For, by multiplying (3) by this $P$, we see that every integer is congruent modulo 96 to $P\gamma^3 + 6P\mu$. But $6P \equiv 6 \pmod{96}$. If $3j^2 - 3j + 1 \equiv 0 \pmod{17}$, multiplication by 6 gives $(j+8)^2 \equiv 7$, whereas 7 is a quadratic non-residue of 17. The two conditions on $C$ are both satisfied if $C^2 = 50$. Then (22)-(24) yield $\tau - w^3 < 65{,}500$. A still lower limit will be found by employing

Lemma 6. Given the positive numbers $s$ and $t$, and a number $B$ for which $0 \leq B \leq s$, $t \leq 9^3s$, we can find an integer $i \geq 0$ such that

$B \leq s - ti^3 < B + t(3r^2 - 3r + 1), \quad r^3 = (s - B)/t.$

The proof consists in the following modification of the last part of the proof of Lemma 3. The condition for $w < 3r^2 - 3r + 1$ is

$(1 - f)\left[3r^2 - 3r(1 + f) + 1 + f + f^2\right] > 0.$

Since $0 \leq f < 1$, this is evidently satisfied when $r \geq 1 + f$. In the contrary case, $i = 0$, $r = f$, and the quantity in brackets is $(1 - f)^2 > 0$.
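The factorization behind Lemma 6 can be confirmed with exact rational arithmetic. In this sketch the helper names `gap` and `factored` are hypothetical; `gap` is $3r^2 - 3r + 1 - w$ with $w = r^3 - (r - f)^3$ as in the proof of Lemma 3.

```python
from fractions import Fraction

def gap(r, f):
    """3r^2 - 3r + 1 - w, where w = r^3 - (r - f)^3."""
    w = r ** 3 - (r - f) ** 3
    return 3 * r * r - 3 * r + 1 - w

def factored(r, f):
    """The product (1 - f) * [3r^2 - 3r(1 + f) + 1 + f + f^2]."""
    return (1 - f) * (3 * r * r - 3 * r * (1 + f) + 1 + f + f * f)

samples = [(Fraction(7, 2), Fraction(1, 2)),
           (Fraction(22, 3), Fraction(1, 3)),
           (Fraction(3, 4), Fraction(3, 4))]     # last sample has r = f, i = 0
for r, f in samples:
    print(r, f, gap(r, f) == factored(r, f), gap(r, f) > 0)
```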


By (21) with 5 replaced by 17, $C - 7 \geq 0.0004411$. This and condition (13) are both satisfied if $C = 7.00045$. It remains to treat integers $s \leq C \cdot 17^9$. By Lemma 4 we may take $s > 40{,}000$. By Lemma 6 with $t = 20$, $B = 8043$, there exists an integer $u \geq 0$ such that

$8043 \leq \sigma < 8043 + 20(3r^2 - 3r + 1), \quad \sigma = s - 20u^3,$

where $r^3 = s/20$ slightly exceeds the initial $r^3$. Then

$\log r = 3.5393787, \quad r^2 = 11{,}988{,}290, \quad r = 3462, \quad \sigma - 8043 < 719{,}089{,}700.$

Apply Lemma 6 with $t = 1$, $B = 8043$, $s$ replaced by $\sigma$. Hence there exists an integer $v \geq 0$ such that

$8043 \leq \tau < 8043 + 3R^2 - 3R + 1, \quad \tau = \sigma - v^3, \quad R^3 = \sigma - 8043,$
$\log R = 2.9522610, \quad R^2 = 802{,}642.2, \quad R = 895.9, \quad \tau - 8043 < 2{,}405{,}240.$

By Lemma 6 with $t = 1$, there exists an integer $w \geq 0$ such that

$8043 \leq \tau - w^3 < 8043 + 3\rho^2 - 3\rho + 1, \quad \rho^3 = \tau - 8043,$
$\log \rho = 2.1270528, \quad \rho^2 = 17{,}951.7, \quad \rho = 134, \quad \tau - w^3 < 61{,}497.$

Hence Theorem I would hold also for $t = 20$ provided an extension of von Sterneck's table would show that every integer between 8043 and 61,497 is a sum of six cubes.

To prove Waring's theorem by means of $p = 17$, $n \geq 3$, and the same $C$, we find by three applications of Lemma 3 with $t = 1$, $B = 8043$, that $\tau - w^3 < 42{,}846.7$. This limit is reduced by using Lemma 6.
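The three-step descent for $t = 20$ can be retraced in floating point. This is a rough sketch of the arithmetic only: the function name `width` is hypothetical, and the results agree with the logarithmic computation in the text only up to rounding.

```python
def width(cube, t=1):
    """t*(3r^2 - 3r + 1) for the real r with r^3 = cube."""
    r = cube ** (1.0 / 3.0)
    return t * (3 * r * r - 3 * r + 1)

B = 8043
s = 7.00045 * 17 ** 9          # upper limit C * 17^9 for s
w1 = width(s / 20, t=20)       # sigma - 8043 < w1
w2 = width(w1)                 # tau - 8043 < w2
w3 = width(w2)                 # (tau - w^3) - 8043 < w3
print(round(w1), round(w2), round(B + w3))
```

The final bound comes out near 61,497, in line with the limit quoted for the required extension of von Sterneck's table.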

6. The second generalizations. We employ two lemmas.

Lemma 7. $F_l = ly^3 + C_7$ represents all positive integers $\leq 40{,}000$ if and only if $l = 2$-$6$, $9$-$15$. $F_7$ represents all $\leq 40{,}000$ except 22. $F_8$ represents all except 23, 239, and 428.

By the tables of Dahse and von Sterneck, $C_7$ represents every positive integer $\leq 40{,}000$ except

(35) $15, 22, 23, 50, 114, 167, 175, 186, 212, 231, 238, 239, 303, 364, 420, 428, 454.$

Thus $F_0 = C_7 \neq 15$. Also $F_1 = C_8 \neq 23$. If $l > 15$, evidently $F_l \neq 15$. Hence let $2 \leq l \leq 15$. The successive differences of the numbers (35) are

(36) $7, 1, 27, 64, 53, 8, 11, 26, 19, 7, 1, 64, 61, 56, 8, 26.$

Hence every positive difference of two numbers (35), not necessarily consecutive, is 1, 7, 8, 11, or is $> 15$.

First, let $l \neq 7, 8, 11$. If $n$ and $m$ $(n > m)$ are any two numbers (35), then $n - m \neq l$. Since $n - l$ is therefore not one of the numbers (35), it is represented by $C_7$. Hence $n$ is represented by $F_l$ with $y = 1$.

A like result holds also if $l = 11$. By (36) the only pair of numbers (35) with the difference 11 is the pair 186, 175. But $186 - 11 \cdot 2^3 = 98$ is not in (35) and hence is represented by $C_7$. Hence $F_{11} = 186$ for $y = 2$.

For $l = 7$, it remains to consider $n = 22$ and 238, which alone exceed predecessors by 7, as seen from (36). But $238 - 7 \cdot 2^3 = 182$ is not in (35) and hence is represented by $C_7$.

Finally, for $l = 8$, (36) shows that only $n = 23$, 175, 239, and 428 exceed smaller numbers in (35) by 8. Since $F_8$ is a sum of eight cubes, it does not represent 23 or 239. Next, $175 - 8 \cdot 2^3 = 111$ is not in (35). But $428 - 8 = 420$, $428 - 8 \cdot 2^3 = 364$, and $428 - 8 \cdot 3^3 = 212$ are all in (35), while $428 < 8 \cdot 4^3$. Hence $F_8 \neq 428$.
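Lemma 7 can be re-derived by machine from the list (35). The following sketch recomputes that list with a dynamic program and then tests each $F_l$; the names `min_cubes`, `not_c7` and `exceptions` are illustrative, and $l \geq 1$ is assumed.

```python
def min_cubes(limit):
    """best[n] = least number of positive cubes summing to n, for 0 <= n <= limit."""
    cubes = []
    c = 1
    while c ** 3 <= limit:
        cubes.append(c ** 3)
        c += 1
    best = [0] * (limit + 1)
    for n in range(1, limit + 1):
        best[n] = 1 + min(best[n - q] for q in cubes if q <= n)
    return best

LIMIT = 40_000
best = min_cubes(LIMIT)
not_c7 = [n for n in range(1, LIMIT + 1) if best[n] > 7]   # expected to be (35)

def exceptions(l):
    """Positive integers <= LIMIT not represented by l*y^3 + C_7 with y >= 0."""
    bad = []
    for n in not_c7:                      # every other n is handled by y = 0
        y, found = 1, False
        while l * y ** 3 <= n:
            if best[n - l * y ** 3] <= 7:
                found = True
                break
            y += 1
        if not found:
            bad.append(n)
    return bad

print(not_c7)
print({l: exceptions(l) for l in range(2, 16)})
```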


Lemma 8. $F = kx^3 + ly^3 + C_7$ represents all positive integers $\leq 40{,}000$ when $l = 2$-$6$, $9$-$15$, and $k$ is arbitrary; when $l = 7$ if and only if $1 \leq k \leq 22$; when $l = 8$ if and only if $1 \leq k \leq 23$; but not if both $k$ and $l$ exceed 15.

In the final case, $F \neq 15$. The first case follows from Lemma 7. Next, let $l = 7$. If $k = 1$, $F$ is $7y^3 + C_8$, which represents all integers $\leq 40{,}000$ by Lemma 4. If $k > 22$, $F = 22$ requires $x = 0$, whereas $7y^3 + C_7 \neq 22$ by Lemma 7. It remains to consider the case $l = 7$, $1 < k \leq 22$. By Lemma 7, we have only to verify that $F = 22$ has integral solutions. When $k = 7$, take $x = y = 1$, since $C_7$ represents 8. When $k \neq 7$, take $x = 1$, $y = 0$, since $C_7$ represents $22 - k$, which is $\geq 0$, $< 22$, and $\neq 15$.

Finally, let $l = 8$. If $k = 1$, apply Lemma 4. If $k > 23$, $F = 23$ implies $x = 0$, whereas $8y^3 + C_7 \neq 23$ by Lemma 7. Hence let $1 < k \leq 23$. By Lemma 7, we have only to verify that $F$ represents 23, 239, 428. If $k \neq 8$, take $x = 1$, $y = 0$; then $F = k + C_7$ represents 23, since $C_7$ represents $23 - k \neq 15, 22, 23$; $F = 239$, since $239 - k$ is not 231 and is in the interval from 216 to 237 and hence is represented by $C_7$; $F = 428$, since $428 - k$ is not 420 and is in the interval from 405 to 426 and hence is represented by $C_7$. If $k = 8$, take $x = y = 1$ and note that $C_7$ represents 7, 223, and 413.


Theorem II. $tx^3 + ly^3 + C_7$ represents all positive integers if $l = 2$, $1 \leq t \leq 34$, $t \neq 10, 15, 20, 25, 30$, and if $l = 3$, $1 \leq t \leq 9$, $t \neq 5$.

Let neither $t$ nor $l$ be divisible by the prime $p \equiv 2 \pmod 3$. By §§ 3, 4, there exists an integer $a \geq 0$ such that (11) holds when $i = a$ and such that $s - ta^3$ is not divisible by $p$. By Lemma 1, there exist integers $b$ and $M$ such that

(37) $s - ta^3 = lb^3 + p^nM, \quad 0 < b < p^n.$

We shall presently choose $C$ and $n$ so that (13) is satisfied. Using also (11) with $i = a$, we have

$Cp^{3n} \leq lb^3 + p^nM < 12p^{3n}, \quad (C - l)p^{3n} < Cp^{3n} - lb^3 \leq p^nM.$

Hence

$(C - l)p^{2n} < M < 12p^{2n}.$

Write $M = N + 6p^{2n}$. Then

(38) $(C - l - 6)p^{2n} < N < 6p^{2n}, \quad s = ta^3 + lb^3 + p^n(N + 6p^{2n}).$

(I) Let $p = 5$. As in (21), the condition $N \geq 5^e \cdot 22^3$ in Lemma 2 is satisfied if $C - l - 6 \geq 0.68148$. Since $t \geq 1$, (13) fails if $n = 3$. Hence $n \geq 4$.

First, let $l = 2$ and take $C = 8.68148$. Condition (14) gives $t \leq 35.076$. Hence 34 is the maximum $t$. It remains to consider integers $s$ satisfying $10^4 < s \leq C \cdot 5^{12}$. Since $t < 5^3$, $C^2 < 5^3$, the radical in (22) is $< 5^{10}$, and $\sigma < 16 \cdot 5^9$. By Lemma 3 with $B = 10^4$, $t = 2$, and $s$ replaced by $\sigma$, there exists an integer $v \geq 0$ such that

(39) $10^4 \leq \tau < 10^4 + 3(2\sigma^2)^{1/3}, \quad \tau = \sigma - 2v^3.$

Since $2\sigma^2 < 4^6 \cdot 5^{18}$, we have (24). This proves Theorem II for $l = 2$, $t \leq 34$, $t$ prime to 5.

Second, let $l = 3$ and take $C = 9.68148$. By (14), $t \leq 9.6186$. Hence $t \leq 9$. Since $tC^2 < 10^3$, (22) gives $\sigma < 31 \cdot 5^8 < 7 \cdot 5^9$. By Lemma 3 with $B = 10^4$, $t = 3$, and $s$ replaced by $\sigma$,

$10^4 \leq \tau < 10^4 + 3(3\sigma^2)^{1/3}, \quad \tau = \sigma - 3v^3.$

Since $3\sigma^2 < 4^6 \cdot 5^{18}$, (24) holds. This proves Theorem II for $l = 3$, $t \leq 9$, $t \neq 5$.

Finally, if $l \geq 4$, then $C \geq 10.682$, and (14) fails if $t \geq 2$. But if $t = 1$, we have the form treated in §4.

(II) Let $p = 11$. Whether $n = 3$ or $n \geq 4$, the condition in Lemma 5 is satisfied if $C - l - 6 \geq 0.006868$, as in (33).

First, let $n = 3$. If $l \geq 3$ and $t \geq 5$, (32) fails. Hence let $l = 2$, $C = 8.006868$. Then $C^2 = 64.11$ and (32) requires that $t \leq 6$. But (32) holds if $t = 5$ since $(5C^2)^{1/3} < 6.844$. The only new case is $t = 5$. It remains to consider integers $s$ satisfying $10^4 < s \leq C \cdot 11^9$. We employ (22), (39), and (24):

$10^4 < (0.006)11^6, \quad \sigma < (20.538)11^6, \quad (2\sigma^2)^{1/3} < (9.45)11^4,$
$\tau < 30 \cdot 11^4 = 330 \cdot 11^3 < 7^3 \cdot 11^3, \quad \tau^{2/3} < 6000, \quad \tau - w^3 < 28{,}000.$

This proves Theorem II for $t = 5$, $l = 2$.


Second, let $n \geq 4$, $l = 2$. Using the same $C$, we find that (14) holds when $t \leq 8145$. But the proof of Theorem II fails for the first new case $t = 10$ when $s \leq C \cdot 11^{12}$. We employ (22), (39), and (24) with the refinement of replacing $10^4$ by 8042. We obtain

$\sigma < (25.8682)11^8, \quad \tau < (73.576)11^5, \quad \tau - w^3 < 163{,}969,$

where the final number is beyond the limit 40,000 of von Sterneck's table.

7. Generalization of Lemmas 2 and 5. These lemmas can be generalized as follows.


Theorem III. If $N$ is a positive integer divisible by neither 2 nor 3, every integer* $\geq 23^3N$ is represented by $N\gamma^3 + 6(x^2 + y^2 + z^2)$, where $\gamma, x, y, z$ are integers and $\gamma \geq 0$.

As in the proof of Lemma 2 this will follow from

Lemma 9. Every integer $n$ is congruent modulo 96 to $N\gamma^3 + 6\mu$ for $0 \leq \gamma \leq 23$, with $\mu$ in the set (1).

Proof was given in §5 when $N \equiv 17 \pmod{48}$. It is true by the proof in Lemma 2 when $N \equiv 5$, $5^2$, $5^3 \equiv 29$, $5^4 \equiv 1 \pmod{48}$.

If $N = 41 + 48l$, $N \equiv 5^2 \pmod{16}$. We saw that the products of the numbers (4) by 5 are congruent modulo 16 to the same numbers (4) rearranged. Hence the same is true of their products by $N$. Also $3N \equiv 3 \cdot 9 \equiv 11$, $11N \equiv 3 \pmod{16}$. Hence the products of all the numbers (1) by $N$ are congruent modulo 16 to the same numbers (1) rearranged. Multiplication of (3) by $N$ proves Lemma 9.

For $N = 37 + 48l$, we proceed as in the last part of the proof of Lemma 2. In $37(2 + 8r) = 2 + 8\rho$, $\rho = 9 + 37r$ ranges with $r$ over a complete set of residues modulo 12. Finally, $37g^3 \equiv g^3 \pmod{12}$. The lemma follows also for $N \equiv 37^3 \equiv 13 \pmod{48}$.

By Lemma 5, the lemma holds when $N \equiv 11$ or $11^3 \equiv 35 \pmod{48}$.

Let $N = 19 + 48l$. The products of the numbers (26) by 3 and hence by $N$ are congruent modulo 16 to the same numbers rearranged. Since $N \equiv 1 \pmod 3$, $32N \equiv 32$, $64N \equiv 64 \pmod{96}$. Since $N + 5$ is divisible by 12, the product of an even cube by $N$ is congruent to its product by $-5$ modulo 96. Hence

$N \cdot 6^3 \equiv -5 \cdot 24 \equiv -24, \quad N \cdot 22^3 \equiv (-5)(-8) \equiv 40,$
$N \cdot 14^3 \equiv (-5)(-40) \equiv 8 \pmod{96}.$


* Except for $N \equiv 11, 19, 35, 43 \pmod{48}$, we may replace 23 by 22. But when $N = 1$, $S = 9832$ is between $21^3$ and $22^3$ and is not represented by $\gamma^3 + 6(x^2 + y^2 + z^2)$. For, that requires $\gamma^3 \equiv S \equiv 4$, $\gamma \equiv 4 \pmod 6$. But no one of $(1/6)(S - 4^3) = 4 \cdot 407$, $(1/6)(S - 10^3) = 16 \cdot 92$, $(1/6)(S - 16^3) = 4 \cdot 239$ is a sum of three squares.
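The exceptional value $S = 9832$ in the footnote can be confirmed by brute force. The sketch below is illustrative (the function names are hypothetical); it searches all $\gamma \geq 0$ with $N\gamma^3 \leq S$ and tests the remaining part for a three-square representation.

```python
from math import isqrt

def is_sum_of_three_squares(m):
    """True if m = x^2 + y^2 + z^2 for integers x, y, z."""
    if m < 0:
        return False
    for x in range(isqrt(m) + 1):
        for y in range(x, isqrt(m - x * x) + 1):
            z2 = m - x * x - y * y
            z = isqrt(z2)
            if z * z == z2:
                return True
    return False

def represented(S, N=1):
    """True if S = N*gamma^3 + 6*(x^2 + y^2 + z^2) with gamma >= 0."""
    g = 0
    while N * g ** 3 <= S:
        rest = S - N * g ** 3
        if rest % 6 == 0 and is_sum_of_three_squares(rest // 6):
            return True
        g += 1
    return False

print(represented(9832))
print(all(represented(S) for S in range(23 ** 3, 23 ** 3 + 300)))
```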




Adding 24 to each, we get 0, 64, 32, respectively. The lemma follows also for $N \equiv 19^3 \equiv 43 \pmod{48}$.

Let $N = 23 + 48l$. Then $N \equiv 7 \pmod{16}$. Omitting 1, 4, 9 from (1), we get

(40) $2, 3, 5, 6, 8, 10, 11, 13, 14,$

whose products by 7 are congruent modulo 16 to the same numbers permuted. For $\mu$ in (40), the residues modulo 96 of $\gamma^3 + 6\mu$ are shown in the following table having the values of $\gamma$ at the top:


1S 17 18 22 


90 
6 


In the body of the table occur $0, 1, \dots, 95$ with the exception of
0, 32, 64. But $32N\equiv 64$, $64N\equiv 32 \pmod{96}$. We proceed as in the proof of
Lemma 5. Since $N+1$ is divisible by 12, the product of an even cube by $N$
is congruent to its negative modulo 96. Hence

$$6^3N\equiv -24,\quad 22^3N\equiv 8,\quad 14^3N\equiv 40 \pmod{96}.$$


Adding 24, we get 0, 32, 64, respectively. The same proof holds for
$N=47+48l$.
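The two displayed sets of congruences on even cubes can be checked numerically. The following is only a sketch: the function name `shifted_even_cubes` is an illustrative assumption, and the bases 6, 22, 14 are read from the displays above. It reduces $N g^3 + 24$ modulo 96 for several $N$ in each residue class.

```python
def shifted_even_cubes(N: int) -> list[int]:
    """Residues modulo 96 of N*g**3 + 24 for the even bases g = 6, 22, 14."""
    return [(N * g ** 3 + 24) % 96 for g in (6, 22, 14)]


# N = 19 + 48*l, where N + 5 is divisible by 12.
for N in (19, 67, 115):
    print(N, shifted_even_cubes(N))

# N = 23 + 48*l and N = 47 + 48*l, where N + 1 is divisible by 12.
for N in (23, 71, 47, 95):
    print(N, shifted_even_cubes(N))
```

In both cases the three missing residues 0, 32, 64 are recovered, in the orders stated in the text.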

For $N\equiv 7$ or $31 \pmod{48}$, the preceding proof is to be modified as for
$N=19+48l$.

This completes the proof of Theorem III. 
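The footnote to Lemma 9 asserts that $S=9832$ is not represented by $\gamma^3+6(x^2+y^2+z^2)$ with $\gamma\ge 0$. A minimal sketch of that check follows. It relies on Legendre's three-square theorem (a non-negative integer fails to be a sum of three squares exactly when it has the form $4^a(8b+7)$), and the helper names are illustrative assumptions.

```python
def is_sum_of_three_squares(m: int) -> bool:
    """Legendre's theorem: m >= 0 fails only when m = 4**a * (8*b + 7)."""
    if m < 0:
        return False
    if m == 0:
        return True
    while m % 4 == 0:
        m //= 4
    return m % 8 != 7


def cube_candidates(s: int) -> list[int]:
    """All gamma >= 0 with gamma**3 <= s and s - gamma**3 divisible by 6."""
    out = []
    g = 0
    while g ** 3 <= s:
        if (s - g ** 3) % 6 == 0:
            out.append(g)
        g += 1
    return out


def is_represented(s: int) -> bool:
    """True if s = gamma**3 + 6*(x*x + y*y + z*z) for some gamma >= 0."""
    return any(is_sum_of_three_squares((s - g ** 3) // 6)
               for g in cube_candidates(s))


S = 9832
print(cube_candidates(S))   # the admissible values of gamma
print(is_represented(S))
```

The only admissible values of $\gamma$ are 4, 10, 16, in agreement with the three quotients listed in the footnote.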

If $0\le n<23^3N$ in Lemma 9, write $\Gamma=\gamma-96$. Then

(41) $$n\equiv N\Gamma^3+6\mu \pmod{96},\qquad n\ge N\Gamma^3.$$

If $n$ is negative, write $\Gamma=\gamma-96w$, and choose a positive integer $w$ so that
$n\ge N\Gamma^3$. If $n\ge 23^3N$, take $\Gamma=\gamma$. In every case, (41) holds. As in the proof
of Lemma 2, this implies


THEOREM IV. If $N$ is any integer prime to 6, every integer is represented
by $N\Gamma^3+6(x^2+y^2+z^2)$, where the integer $\Gamma$ may be negative.


8. Representation of all large numbers. We prove the following 
theorem. 


Table (proof of Lemma 9): recovered rows of residues modulo 96 of $\gamma^3+6\mu$; the column alignment is not recoverable.

0 1 2 3 5 6 7 8
12 13 20 39 76 41 69 1 29
18 19 26 45 82 47 42 73 50 58 5 33
36 37 44 63 4 65 91 23 51
48 49 56 75 16 77 72 7 80 88 35 8 24 40
60 61 68 87 28 89 21
66 67 74 93 34 95 25 2 27 10 53 55 81 83
78 79 86 9 14 22
84 85 92 15 52 17 43 71 3




THEOREM V. For $l=1, 2, 3, 4$, or 5, $F_l=ly^3+C_7$ represents all sufficiently
large integers.*


Let $r$ be the real ninth root of $12/(6.9+l)$. Then $r>1$. The number of
primes $\equiv 2 \pmod 3$ which exceed $x$ and are $<rx$ is known to increase
indefinitely with $x$. Choose as $x$ the first radical in (42). Hence for all
sufficiently large integers $n$, there exist at least ten primes $p$ such that

(42) $$\left(\frac{n}{12}\right)^{1/9} < p < \left(\frac{n}{6.9+l}\right)^{1/9}.$$


The product of the ten primes exceeds $(n/12)^{10/9}$ and hence exceeds $n$ if
$n>12^{10}$. Hence not all ten are divisors of $n$. Henceforth, let $p$ be a prime
$>l$ not dividing $n$ and satisfying (42). By Lemma 1 there exist integers $\delta$
and $M$ satisfying

$$n=l\delta^3+p^3M\equiv l\delta^3 \pmod{p^3},\qquad 0\le\delta<p^3.$$

By (42), $(6.9+l)p^9<n<12p^9$. Hence
$$6.9p^9 < n-l\delta^3 = p^3M < 12p^9.$$
Cancellation of factors $p^3$ gives
$$6.9p^6 < M < 12p^6.$$


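Write $M=N+6p^6$. Then $0.9p^6<N<6p^6$. Let $p\ge 11$. Then $N>22^3$. By
Lemma 2 with $e=0$, $N$ can be represented by $\gamma^3+6(d_1^2+d_2^2+d_3^2)$ with $\gamma\ge 0$.
If any $|d_i|\ge p^3$, then $N\ge 6p^6$, contrary to the above. Hence in

$$n = l\delta^3+p^3M = l\delta^3+p^3\gamma^3+6p^9+6p^3(d_1^2+d_2^2+d_3^2)$$

$$= l\delta^3+(p\gamma)^3+\sum_{i=1}^{3}\left[(p^3+d_i)^3+(p^3-d_i)^3\right],$$

each cube is $\ge 0$. This proves Theorem V.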
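The decomposition rests on the identity $(a+d)^3+(a-d)^3=2a^3+6ad^2$ with $a=p^3$. The sketch below rebuilds $n$ from chosen values and checks that it equals $l\delta^3$ plus seven cubes. The function name and the sample values are illustrative assumptions; they are toy numbers that are not required to satisfy (42).

```python
def eight_cube_bases(l, delta, p, gamma, d):
    """Return (n, bases) with n = l*delta**3 + sum(b**3 for b in bases).

    Here N = gamma**3 + 6*(d1**2 + d2**2 + d3**2), M = N + 6*p**6 and
    n = l*delta**3 + p**3 * M, as in the proof of Theorem V.
    """
    N = gamma ** 3 + 6 * sum(x * x for x in d)
    M = N + 6 * p ** 6
    n = l * delta ** 3 + p ** 3 * M
    # One cube (p*gamma)**3 and three pairs (p**3 + d_i)**3, (p**3 - d_i)**3.
    bases = [p * gamma] + [b for x in d for b in (p ** 3 + x, p ** 3 - x)]
    return n, bases


n, bases = eight_cube_bases(l=1, delta=7, p=11, gamma=3, d=(5, -2, 0))
print(n, bases)
print(n == 1 * 7 ** 3 + sum(b ** 3 for b in bases))
```

Every base is non-negative as long as each $|d_i|<p^3$, which is the condition derived in the text.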
The following second proof applies to numbers exceeding a much smaller
limit. For $n$ sufficiently large, there exist seven primes $P$ satisfying

(43) $$\left(\frac{n}{12}\right)^{1/6} < P < \left(\frac{n}{C}\right)^{1/6},\qquad P\equiv 2 \pmod 3,\qquad C<12.$$

The earlier discussion applies when $p^3$ is replaced by $P^2$ and gives
$$M=N+6P^4,\qquad N<6P^4.$$
* For $l=1$, the case of 8 cubes, see Landau, Mathematische Annalen, vol. 66 (1909), pp. 102-5;
Verteilung der Primzahlen, vol. 1, 1909, pp. 555-9. For $l=2$, Dickson, Bulletin of the American
Mathematical Society, vol. 33 (1927), p. 299.




Thus $N\ge 23^3P$ if

(44) $$l+6+\left(\frac{23}{P}\right)^3 \le C,$$

which holds if $C=l+6.9$. Then by Theorem III, $N$ is represented by
$P\gamma^3+6(d_1^2+d_2^2+d_3^2)$ with $\gamma\ge 0$. Hence

$$n = l\delta^3+(P\gamma)^3+\sum_{i=1}^{3}\left[(P^2+d_i)^3+(P^2-d_i)^3\right],$$
where each cube is $\ge 0$, since each $|d_i|<P^2$.
We may now readily verify that all integers of a wide range are sums of
eight cubes. For $P>1150$, (44) is satisfied if $C=7.00001$. Take $n=Cm^6$.
Then (43) gives

$$rm<P<m,\qquad r=(C/12)^{1/6},\qquad \log r=\bar{1}.9609862.$$


Start with $m=1500$. Then $rm=1371.1$. The ten primes $\equiv 2 \pmod 3$
between 1371 and 1500 are


1373, 1409, 1427, 1433, 1439, 1451, 1481, 1487, 1493, 1499. 


Equating the fourth to $rm'$, we get $m'=1567.7$. Hence the last seven primes
serve for every $m$ from 1500 to $m'$. Repeating with $m'$ in place of $m$, we get
as further $P$'s 1511, 1523, 1553, 1559. Hence 1487, 1493, 1499, and these
four serve for every $m$ from $m'$ to 1626.7. We advance similarly to 1705.5,
1751.4, and $M=1771.2$. But the four primes between $M$ and the seventh
prime 1733 serving for the third interval are all $\equiv 1 \pmod 3$. We may
employ $41^2$ and the last six of the seven primes, since their product by 41
exceeds the $n$ corresponding to $M$, since (43) holds when $P=41^2$, and since
Lemma 1 holds when $p$ is replaced by any product $P$ of primes each $\equiv 2 \pmod 3$.
Hence we advance from $M$ to $1637/r=1790.9$, and thence to
1823.7 (again using $41^2$), 1856.5, $\mu=1869.6$. Lacking new primes $\equiv 2 \pmod 3$,
we use $P=11\cdot 167$ and note that the product of 11 and the last
six of the seven primes exceeds the $n$ corresponding to $\mu$. We therefore
advance to 1882.7. The next 13 steps proceed to 3307.1 by means of primes
only, the number of available new primes being 2, 1, 5, 7, 7, 6, 4, 9, 8, 7, 11,
12, 10 respectively.
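The first steps of this advance can be reproduced numerically. The sketch below assumes $r=(C/12)^{1/6}$ with $C=7.00001$, which is a reading of the damaged display for $r$ and agrees with the stated values $rm=1371.1$ and $m'=1567.7$. The helper names are illustrative.

```python
import math

C = 7.00001
r = (C / 12) ** (1 / 6)   # assumed form of r; about 0.914


def is_prime(k: int) -> bool:
    """Trial division, adequate for the four-digit numbers used here."""
    if k < 2:
        return False
    return all(k % q for q in range(2, math.isqrt(k) + 1))


def primes_2_mod_3(lo: float, hi: float) -> list[int]:
    """Primes p with lo < p < hi and p congruent to 2 modulo 3."""
    start = math.floor(lo) + 1
    stop = math.ceil(hi)
    return [p for p in range(start, stop) if p % 3 == 2 and is_prime(p)]


m = 1500
first = primes_2_mod_3(r * m, m)
print(round(r * m, 1), first)

# Equate the fourth prime to r*m' so that the last seven primes still serve.
m_next = first[3] / r
print(round(m_next, 1), primes_2_mod_3(m, m_next))
```

The second call lists the further primes that become available between $m$ and $m'$.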

We may also proceed from 1500 to smaller values of m. Without new 
device, we reach 1163. For the next step we have available only five primes 
1091, 1097, 1103, 1109, 1151, and $P=5\cdot 227$, $11\cdot 101$, $23\cdot 47$. The advance
to $1061/r=1160.7$ requires the verification that the integers $n$ in the interval




which are divisible by the five primes and one of the factors of each of the 
three P’s are actually sums of 8 cubes. With occasionally a like verification, 
we may advance in 26 steps to 821. The next step would involve serious 
additional verifications, since there are available only 761, 773, 797, 809, 
11 - 71, 17 - 47 as values of P. 

The $n$ corresponding to the final $m=821$ is $10^{17}(21.436)$. Employing
technical theory of primes, Baer* proved that every integer $>23\cdot 10^{14}$ is a
sum of eight cubes. The interest of our work lies in its very elementary
character.

By two applications of Lemma 6 with ¢=1, B =8043, we find that every 
integer between 8043 and 227,297,300 is a sum of eight cubes. This limit
is nearly 4% larger than that obtained by Lemma 3. 


* Beiträge zum Waringschen Problem, Dissertation, Göttingen, 1913.


UNIVERSITY OF CHICAGO,
CHICAGO, ILL.




CUBIC CURVES AND DESMIC SURFACES; 
SECOND PAPER* 


BY 
R. M. MATHEWS 


1. Introduction. It is evident superficially that there is some connection 
between cubic curves and desmic surfaces. It is well known that the equation 
of every cubic curve of the sixth class can be reduced to the form 


(1) $$y^2=4x^3-g_2x-g_3=4(x-e_1)(x-e_2)(x-e_3),$$


and the coördinates of a point on this curve are given parametrically as
$x=\wp(u)$, $y=\wp'(u)$, where $\wp(u)$ is Weierstrass's elliptic $\wp$-function which is a
solution of the differential equation

(2) $$[\wp'(u)]^2=4[\wp(u)]^3-g_2\wp(u)-g_3.$$

To every cubic of genus 1 there corresponds such a $\wp$-function and to every
$\wp$-function there corresponds a projective class of cubics. On the other hand,
the desmic surface may be defined analytically as the locus of a point whose
coördinates are


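$$x_0=\frac{\sigma(u)}{\sigma(v)},\qquad x_1=\frac{\sigma_1(u)}{\sigma_1(v)},\qquad x_2=\frac{\sigma_2(u)}{\sigma_2(v)},\qquad x_3=\frac{\sigma_3(u)}{\sigma_3(v)}.$$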


Thus to every desmic surface corresponds a $\wp$-function and to every such
function a class of projective desmic surfaces.

This superficial analytic indication of relationship between the classes of
curves and surfaces sets the problem of finding intimate geometrical
connections. I have shown some of these relations in a former paper† and now
present others.

2. Setting of the problem. We recall first some known facts. (i) From 
an arbitrary point $A$ on a cubic curve $C_3^6$ of the sixth class four tangents can


* Presented to the Society, April 2, 1926; received by the editors in January, 1927. 
† These Transactions, vol. 28 (1926), pp. 502-522.




where 


20 R. M. MATHEWS [January 


be drawn. The cross ratios of this pencil of tangents are constant as $A$
describes the curve, and they are the mutual ratios of the roots of
$z^3-3z+2/J^{1/2}=0$ where $J$ denotes the absolute invariant $64S^3/(64S^3+T^2)$
of the cubic, or equally well of the corresponding $\wp$-function. (ii) If $ABC$
are three collinear points on $C_3^6$, then the three quadrangles of the points of
contact of the tangents from them form a $(12_4, 16_3)$ Hessian configuration on
the curve. Any two of the quadrangles are perspective from each vertex
of the third. (iii) Three tetrahedra are in desmic formation when their
vertices form a $(12_4, 16_3)$ space configuration, any two of the tetrahedra
being perspective from each vertex of the third. (iv) They form the base of
a pencil of quartic surfaces, called desmic surfaces. The points are the 12
nodes and the lines of perspectivity are the 16 lines on each surface. (v) The
pencil contains three degenerate surfaces, namely the tetrahedra. If $\lambda=0$,
$\mu=0$, $\nu=0$ be the equations of these three surfaces, then $\lambda+\mu+\nu=0$. (vi)
If these forms be evaluated for an arbitrary point $P(x)$ of space, not on the
tetrahedra, then the equation of the desmic surface $D$ through $P$ may be
written

(5) $$\lambda(y_0^2y_1^2+y_2^2y_3^2)+\mu(y_0^2y_2^2+y_1^2y_3^2)+\nu(y_0^2y_3^2+y_1^2y_2^2)=0.$$


(vii) The tetrahedra are also the base of a net of quadrics and the generators 
of these quadrics constitute a desmic cubic complex of lines. (viii) Now, a 
point P(x) determines a desmic surface of the pencil and is the vertex of a 
cubic cone of the complex. As shown in the former paper, this cone passes 
through the vertices of the tetrahedra and so cuts an arbitrary transversal 
plane in a cubic curve with the desmic points projecting into a Hessian 
configuration on it. Moreover, the tangent plane to D at P cuts the cone in 
the three generators which give on $C_3^6$ the three collinear points $ABC$ proper
to the Hessian configuration. Conversely, it was shown in the first paper
(p. 509) how to construct a set of tetrahedra and a desmic surface to
correspond to a given $C_3^6$. When one of the desmic tetrahedra is taken for
reference and a vertex of a second for unit point, then the equation of the


cubic curve on $y_3=0$ is
(x — — x1y2) + (x? — — 
+ (a? — x?) yoyi(x1¥0 — xoy1) = 0.* 
3. Developments for the general case. We seek some of the consequences 
of these properties. We compute the absolute invariant of the cubic curve by 
Salmon’s formulas, and find 


(6) $$J=\frac{4(\mu\nu+\nu\lambda+\lambda\mu)^3}{4(\mu\nu+\nu\lambda+\lambda\mu)^3+(\mu-\nu)^2(\nu-\lambda)^2(\lambda-\mu)^2}.$$


* Loc. cit. equation 11.


Thus the invariant of the cubic curve and of the cubic cone is expressed in
terms of the coefficients of the surface. The cross ratios of the four tangent
planes through a generator of a cone, or of the four tangent lines which
they cut on the transversal plane, are of the type $-\lambda:\mu$; and as
these ratios must be the same for all points of $D$ (cf. equation (5)) it follows
that

(7) All cones of a desmic cubic complex whose vertices lie on one desmic surface
have the same absolute invariant and are projective.


4. The assignment of $\sigma u/\sigma v$ as $x_0$, etc. is arbitrary. For a given $P(x_i)$,
23 other points may be obtained by permuting the coördinates, and the
24 correspond to the symmetric group $G_{24}$. This group contains as subgroup
the symmetric group $G_6$. We find that the 24 points lie by fours on six
distinct desmic surfaces of the pencil and the absolute invariant is the same
for all six surfaces, for $\lambda$, $\mu$, $\nu$ are merely permuted by the permutations of
the $G_6$. Conversely, the absolute invariant, when equated to an arbitrary
number, gives an equation of the 24th degree which factors into six of the
fourth degree, and these give the six conjugate desmic surfaces of the pencil.
Hence


The locus of the vertices of the projective cubic cones in a desmic cubic 
complex consists of six conjugate desmic surfaces. 
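That the absolute invariant is unchanged when $\lambda$, $\mu$, $\nu$ are permuted can be illustrated with exact arithmetic. The sketch below assumes that equation (6) has the form $J=4p^3/\left(4p^3+(\mu-\nu)^2(\nu-\lambda)^2(\lambda-\mu)^2\right)$ with $p=\mu\nu+\nu\lambda+\lambda\mu$, which is a reading of the damaged display. The function name is an illustrative assumption.

```python
from fractions import Fraction
from itertools import permutations


def absolute_invariant(lam: int, mu: int, nu: int) -> Fraction:
    """J from the assumed form of (6); requires lam + mu + nu == 0."""
    if lam + mu + nu != 0:
        raise ValueError("lambda + mu + nu must vanish")
    p = mu * nu + nu * lam + lam * mu
    disc = (mu - nu) ** 2 * (nu - lam) ** 2 * (lam - mu) ** 2
    return Fraction(4 * p ** 3, 4 * p ** 3 + disc)


# The six permutations of one triple all give the same value of J.
values = {absolute_invariant(*t) for t in permutations((1, 2, -3))}
print(values)

# Two equal coefficients give the harmonic value J = 1.
print(absolute_invariant(1, 1, -2))
```

Under the relation $\lambda+\mu+\nu=0$ the denominator equals $-27\lambda^2\mu^2\nu^2$, so $J$ becomes infinite exactly when one of $\lambda$, $\mu$, $\nu$ vanishes.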


5. The cross ratios $-\lambda:\mu$ have a still closer relation to the features of the
surface. Those generators of the cone which lie in the tangent plane at $P$
are three of the bitangents which can be drawn from $P$ to $D$, and their second
points of contact can be determined as follows. If the vertices of two of the
tetrahedra be taken as the eight invariant points of a cubic involution
$x_i'=1/x_i$, then $D$ transforms into itself and $P$ interchanges with $P'$, the second
point of contact on one of the bitangents. If tetrahedra II and III be the
invariant base of the transformation, we shall say that the bitangents
signalized in this manner are of the first system. Now the four planes determined on
$PP'$ by the vertices of tetrahedron I have cross ratios of type $-\lambda:\mu$. Hence


The bitangents of the same system on a desmic surface subtend at the vertices 
of the proper tetrahedron four planes of constant cross ratio. 


6. By a theorem of Steiner’s these lines also cut the faces of the tetra- 
hedron in a range of the same cross ratio. Thus 


Bitangents of the same system on a desmic surface belong to a complex of
Reye (tetrahedral complex) for which the corresponding tetrahedron of nodes is
fundamental.

Under a cubic involution whose eight invariant points are the vertices of two 
desmic tetrahedra, a desmic surface transforms into itself and the lines which 
join corresponding points belong to a tetrahedral complex on the third tetrahedron. 


7. Special surfaces. When the value of $J$ is 1, 0, or $\infty$, the corresponding
cubic curve is harmonic, equianharmonic, or nodal. We determine the
corresponding desmic surfaces.
Let $J=1$, then $T^2=0$; thus the desmic surfaces on which the vertices of
the cubic cones lie when the cubics are harmonic are
$$\lambda_y-\mu_y=0,\qquad \mu_y-\nu_y=0,\qquad \nu_y-\lambda_y=0.$$


These are obtained by setting each factor of T equal to zero and considering 
x as a variable y. As T?=0, each factor counts twice and the set of six 
desmic surfaces reduces to three. 

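8. If $J=0$, then $S^3=0$. The solution of the system

$$S=\rho(\mu\nu+\nu\lambda+\lambda\mu)=0,\qquad \lambda+\mu+\nu=0,$$
gives
$$\mu=\omega\lambda,\qquad \nu=\omega^2\lambda,$$
where $\omega$ is a complex cube root of unity. Hence the desmic surfaces for
equianharmonic cubics are

$$y_0^2y_1^2+y_2^2y_3^2+\omega(y_0^2y_2^2+y_1^2y_3^2)+\omega^2(y_0^2y_3^2+y_1^2y_2^2)=0$$
and
$$y_0^2y_1^2+y_2^2y_3^2+\omega^2(y_0^2y_2^2+y_1^2y_3^2)+\omega(y_0^2y_3^2+y_1^2y_2^2)=0.$$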


These are the only two distinct desmic surfaces in the set of six, for all the
others obtained by permutation of the coefficients can also be obtained in this
instance by multiplying each of the equations by $\omega$ and $\omega^2$.

9. If $J=\infty$, then $64S^3+T^2=0$. This corresponds to $g_2^3-27g_3^2=0$ for
the elliptic $\wp$-function. If the four numbers of the set $(x_i^2)$ be taken as the
roots of the quartic

$$f(z)=a_0z^4+4a_1z^3+6a_2z^2+4a_3z+a_4=0,$$

then $g_2^3-27g_3^2$ is the discriminant of that equation and when equated to
zero implies that at least two of the roots are equal. Therefore point $(x)$ is
on one of the six pairs of planes $y_i^2-y_j^2=0$. These are just the degenerate
pairs of planes which taken again in twos make the three degenerate desmic
surfaces of the pencil. Thus the locus of $(x)$ for a nodal cubic consists of the
three degenerate desmic surfaces of the pencil:

$$\lambda=0,\qquad \mu=0,\qquad \nu=0.$$


Each of the corresponding nodal cubics degenerates to a line and a conic. 
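The statement that $g_2^3-27g_3^2$ vanishes when two roots of the quartic coincide can be checked with exact arithmetic. The sketch below takes $g_2$ and $g_3$ to be the standard invariants of a quartic written with binomial coefficients, $g_2=a_0a_4-4a_1a_3+3a_2^2$ and $g_3=a_0a_2a_4+2a_1a_2a_3-a_2^3-a_0a_3^2-a_1^2a_4$. These formulas are not in the text, and the function name is an illustrative assumption.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def quartic_invariants(roots):
    """Invariants (g2, g3) of the monic quartic with the four given roots.

    The quartic is written as a0 z^4 + 4 a1 z^3 + 6 a2 z^2 + 4 a3 z + a4.
    """
    # Elementary symmetric functions e1..e4 of the roots.
    e = [sum(prod(c) for c in combinations(roots, k)) for k in range(1, 5)]
    a0 = Fraction(1)
    a1 = Fraction(-e[0], 4)
    a2 = Fraction(e[1], 6)
    a3 = Fraction(-e[2], 4)
    a4 = Fraction(e[3])
    g2 = a0 * a4 - 4 * a1 * a3 + 3 * a2 ** 2
    g3 = (a0 * a2 * a4 + 2 * a1 * a2 * a3
          - a2 ** 3 - a0 * a3 ** 2 - a1 ** 2 * a4)
    return g2, g3


for roots in [(1, 1, 4, 9), (1, 4, 9, 16)]:
    g2, g3 = quartic_invariants(roots)
    print(roots, g2 ** 3 - 27 * g3 ** 2)
```

The first set of roots has a repeated value, as happens when $x_i^2=x_j^2$, and the second has four distinct values.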

10. For cuspidal cubics T and S are zero simultaneously. We find that 
the corresponding cubics degenerate to three concurrent lines. 

11. From these last two results we see that there is no correspondence 
between proper desmic surfaces and proper singular cubics; or otherwise, 
no cone in a desmic complex has a single double line. 

12. Associated surfaces of order eight. It happens that the eighteen 
edges of the desmic tetrahedra are also the edges, in a different grouping, of a 
counter-set of desmic tetrahedra, and there is another pencil of desmic 
surfaces D’ on these (First paper, p. 507). Through an arbitrary point P, 
there passes a surface of each pencil. As any one surface D is cut by every 
surface of the other pencil, the values of the absolute invariant for the two 
surfaces through a point are different, in general. If we seek the locus of
points for which they are equal, that is, form the combinations of type
$\lambda_x\mu'_x-\lambda'_x\mu_x=0$, we find six surfaces of the eighth order whose coefficients
are numerical. A typical equation is


(x? af) (x? Do xe x2 x? + 8x0x1%2%3) 


+ — x?)(x? — x?) = 0. 


Thus we have a set of surfaces intimately connected with a set of desmic 
tetrahedra and its counter-set. Each surface passes through the 16 lines of 
the pencil {D} and the 16 of pencil {D’}. Moreover, each passes through 
six of the eighteen edges of the tetrahedra. 


UNIVERSITY OF WEST VIRGINIA, 
MORGANTOWN, W. VA.




POSSIBLE ORDERS OF TWO GENERATORS OF THE 
ALTERNATING AND OF THE SYMMETRIC GROUP* 


BY 
G. A. MILLER 


It is well known that every alternating and every symmetric group can 
be generated by two of its substitutions, and that two such generating sub- 
stitutions can usually be selected in a large number of different ways. Since 
two operators of order 2 must always generate a dihedral group it is evident 
that no alternating group can be generated by two of its substitutions of 
order 2, and that the only symmetric group which can be thus generated is 
the symmetric group of order 6. On the other hand, it is known that with very
few exceptions, relating to groups whose degrees do not exceed 8, every
alternating group and every symmetric group can be generated by two of
its substitutions of orders 2 and 3 respectively.† In the present article we
shall prove that whenever an alternating group involves a substitution of
order $l>3$ then it contains two substitutions of orders 2 and $l$ respectively
which generate the entire group. We shall also determine the degrees of all 
the symmetric groups to which a similar theorem does not apply. 

Before proving this general theorem, it may be desirable to consider the 
more elementary question of generating an alternating or a symmetric group 
by two of its substitutions which are separately composed of a single cycle. 
When neither of the two numbers $l_1$, $l_2$ exceeds $n$ but their sum exceeds $n$
it is obvious that two substitutions $s_1$, $s_2$ which are separately composed of
a single cycle, and whose orders are $l_1$, $l_2$ respectively, can be so selected
that they generate a transitive group of degree $n$, and that half the
substitutions of this transitive group are negative whenever at least one of the two
numbers $l_1$, $l_2$ is even. If $s_1$ and $s_2$ do not have all their letters in common we
may suppose that the common letters are arranged in the same order in both
of these substitutions and hence their commutator is either of the form $abc$
or of the form $ab\cdot cd$. If both of them are of degree $n$ we may suppose that
all their letters are arranged in the same order with the exception that two
adjacent letters are interchanged. Hence their commutator is of the form
$abc$ in this case. The group generated by $s_1$, $s_2$ is obviously multiply transitive


* Presented to the Society, December 31, 1926; received by the editors, December 20, 1926. 
† G. A. Miller, Bulletin of the American Mathematical Society, vol. 7 (1901), p. 424.




whenever $n>3$ and hence it must be alternating or symmetric, at least when
$n>8$, since the class of a primitive group which is neither alternating nor
symmetric must exceed 4 whenever its degree exceeds 8. When $n$ does not
exceed 8 it is easy to verify directly that two such substitutions can be so
selected as to generate the alternating group when both of the numbers $l_1$, $l_2$
are odd, and the symmetric group when at least one of these numbers is even.
These results may be stated in the form of a theorem as follows: If $l_1$, $l_2$
represent a pair of numbers, each being greater than unity, such that neither
exceeds $n$ but their sum exceeds $n$, then it is always possible to find two cycles of
orders $l_1$ and $l_2$ respectively such that they generate the alternating group of
degree $n$ whenever both $l_1$ and $l_2$ are odd. When at least one of these two numbers
is even they generate the symmetric group of the same degree.
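Small cases of this theorem can be verified by brute force. The sketch below represents a substitution on the letters $0, 1, \dots, n-1$ as a tuple and closes a generating set under multiplication. The helper names and the particular cycles, which share exactly one letter, are illustrative assumptions and not choices made in the text.

```python
def cycle(n, letters):
    """The substitution on 0..n-1 that permutes `letters` cyclically."""
    perm = list(range(n))
    for a, b in zip(letters, letters[1:] + letters[:1]):
        perm[a] = b
    return tuple(perm)


def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[i] for i in p)


def generated_group(gens):
    """All products of the generators; this is the whole group, being finite."""
    identity = tuple(range(len(gens[0])))
    seen = {identity}
    frontier = [identity]
    while frontier:
        new = []
        for g in frontier:
            for s in gens:
                h = compose(g, s)
                if h not in seen:
                    seen.add(h)
                    new.append(h)
        frontier = new
    return seen


# n = 5, both orders odd (3 and 3): the alternating group, of order 60.
print(len(generated_group([cycle(5, [0, 1, 2]), cycle(5, [2, 3, 4])])))
# n = 5, one order even (4 and 2): the symmetric group, of order 120.
print(len(generated_group([cycle(5, [0, 1, 2, 3]), cycle(5, [3, 4])])))
# n = 7, orders 5 and 3: the alternating group, of order 2520.
print(len(generated_group([cycle(7, [0, 1, 2, 3, 4]), cycle(7, [4, 5, 6])])))
```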

From this theorem it results directly that if $l_1$, $l_2$ represents any pair of
positive integers such that each exceeds unity then it is always possible to
find two cycles $s_1$, $s_2$ of orders $l_1$, $l_2$ respectively such that the group generated
by $s_1$, $s_2$ is either alternating or symmetric and has an arbitrary one of
$l_1$, $l_1\le l_2$, different degrees. For instance, pairs of cycles of order 9 can be
selected such that each pair generates an arbitrary one of nine different
alternating groups, viz., the alternating groups whose degrees vary from
9 to 17 inclusive, while an arbitrary one of the nine different symmetric
groups of the degrees 10 to 18 can be generated by a cycle of order 9 and a
cycle of order 10. This constitutes a complete solution of the elementary
problem of generating alternating or symmetric groups by means of two cycles
on their letters.

For the proof of the general theorem noted in the first paragraph it will 
often be convenient to use the following obvious theorem: If a transitive
group of degree $n$ contains a cycle of prime order $p$, where $p$ satisfies the condition
$n/2<p\le n-3$, it must be either alternating or symmetric. It is known that
whenever $n>7$ it is always possible to find such a prime number.*

It will also frequently be desirable to use the following theorem: 


If a transitive group is generated by two substitutions $s_1$, $s_2$, and if one of these
substitutions $s_2$ involves one and only one cycle of a given prime degree while the
other does not transform all the letters of this cycle into letters which do not occur
in it, then the transitive group generated by $s_1$, $s_2$ is either alternating or
symmetric whenever its degree exceeds the degree of this cycle by more than 2.


Since it is well known that a primitive group of degree $n$ which does not
include the alternating group of this degree can not involve a substitution


* G. A. Miller, School Science and Mathematics, vol. 21 (1921), p. 874. 




26 G. A. MILLER [January 


composed of a single cycle of prime degree less than $n-2$, it is only necessary
to show that $s_1$ and $s_2$ generate a primitive group. To do this we may
transform by $s_1$ a power of $s_2$ which is equal to this prime cycle and thus obtain a
prime cycle which has not all its letters in common with the former but has at
least one letter in common therewith. Hence the group $G$ generated by $s_1$, $s_2$
involves two such prime cycles which generate a doubly transitive group 
whose degree is just one larger than the degree of one of these cycles. Since 
a transitive group which contains a primitive subgroup of lower degree is 
itself primitive unless all the letters of this primitive subgroup appear in 
one of the sets of every one of its possible systems of imprimitivity, it 
results almost directly that G must be primitive and hence the theorem in 
question has been established. 

To exhibit the nature of the limitations imposed in this theorem and at 
the same time prove a somewhat striking theorem it may be noted here that 
if $s_2$ represents a substitution of composite order $k_1k_2$ and is composed of two
cycles of orders $k_1k_2$ and $k_2$ respectively, and if $s_1$ is a transposition which involves
one letter from each of these two cycles, then the group generated by $s_1$, $s_2$ is
imprimitive and contains invariantly the direct product of $k_1$ symmetric groups
which are separately of degree $k_2+1$. A proof of this theorem results from the
following consideration. The commutator of $s_1$ and $s_2^{k_1}$ is a cycle of order 3.
The symmetric group of degree 3 generated by this commutator and $s_1$
has two letters in common with a cycle of $s_2^{k_1}$ and these two letters are
adjacent in this cycle. Hence $G$ involves the symmetric group of degree
$k_2+1$ in view of the theorem noted near the close of the preceding paragraph.
This symmetric group is transformed by $s_2$ into $k_1$ symmetric groups such
that no two of them have a letter in common. The order of $G$ is $k_1$ times
that of the direct product of these symmetric groups. The simplest
illustration of such a group is the transitive group of degree 6 and of order 72. In
this case $k_1=k_2=2$.
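The simplest illustration, with $k_1=k_2=2$, can be checked by enumeration. In the sketch below the labelling of the six letters as $0, \dots, 5$, the choice $s_2=(0\,1\,2\,3)(4\,5)$, $s_1=(0\,4)$, and the helper names are all illustrative assumptions.

```python
def perm_from_cycles(n, cycles):
    """The substitution on 0..n-1 with the given disjoint cycles."""
    perm = list(range(n))
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            perm[a] = b
    return tuple(perm)


def generated_group(gens):
    """Close the generators under multiplication (apply g, then s)."""
    identity = tuple(range(len(gens[0])))
    seen = {identity}
    frontier = [identity]
    while frontier:
        new = []
        for g in frontier:
            for s in gens:
                h = tuple(s[i] for i in g)
                if h not in seen:
                    seen.add(h)
                    new.append(h)
        frontier = new
    return seen


s2 = perm_from_cycles(6, [(0, 1, 2, 3), (4, 5)])  # cycles of orders 4 and 2
s1 = perm_from_cycles(6, [(0, 4)])                # one letter from each cycle
G = generated_group([s1, s2])
print(len(G))

# Systems of imprimitivity: every element maps each set onto one of the two.
blocks = [frozenset({0, 2, 4}), frozenset({1, 3, 5})]
print(all(frozenset(g[i] for i in b) in blocks for g in G for b in blocks))
```

The order $72 = 2\cdot(3!)^2$ agrees with $k_1$ times the order of the direct product of $k_1$ symmetric groups of degree $k_2+1$.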

Another elementary theorem which will be very useful in the solution of 
our general problem may be stated as follows: 


If $s_1$, $s_2$ generate a transitive group and if $s_1$ has only one letter in common
with some power of $s_2$ and if the letter by which this common letter is replaced
in $s_1$ does not appear in a cycle of $s_2$ whose order is a multiple of the number of
cycles in the given power of $s_2$, then this transitive group includes the alternating
group of its degree.


The proof of this theorem is similar to that noted in the preceding
paragraph. The commutator of the given power of $s_2$ and $s_1$ is again a cycle of
order 3 and hence $G$ involves the alternating group on a number of letters
which is at least one larger than the order of a cycle in the said power of $s_2$.
This alternating group involves letters from at most two cycles of $s_2$. Hence
$s_2$ would have to transform it into alternating groups on sets of distinct
letters if the theorem were not true. As this is impossible from the conditions
noted in the theorem it results that the theorem is established. When the
given power of $s_2$ is a single cycle it is clear that the condition as regards
the letter by which the common letter is replaced in it may be omitted in the
theorem.

In what follows $s_2$ will represent a substitution of order $l>3$ while $s_1$
will represent a substitution of order 2. The smallest possible degree of $s_2$
is the sum of the highest powers of the prime power factors of $l$, and when $l$
is even, $s_2$ must be negative both for this smallest degree and also for the
next larger degree of the symmetric group in which it appears. In every
larger symmetric group there is a positive as well as a negative substitution
of order $l$. It will be assumed that $s_1$ and $s_2$ have been so constructed that they
generate a transitive group of degree $n$, and it results directly from the
theorems noted above that when $l$ is a given even number, $s_1$, $s_2$ can be so
chosen that they generate the symmetric group when $n$ has the smallest
possible value or the next larger value. In what follows we may therefore
always assume that $n$ has a larger value. When $l$ is odd, $s_2$ must be positive,
and when $l$ is even it will be assumed that $s_2$ is positive unless the contrary
is stated, and that the degree of $s_2$ is not less than $n-p_1+1$, where $p_1$ is the
smallest prime factor of $l$ when $l$ is either odd or divisible by 3. When
neither of these two conditions is satisfied then the degree of $s_2$ may be
assumed to be at least $n-3$. Moreover, it will generally be assumed that the
cycles of $s_2$ appear in descending order of magnitude in case there is a
difference in their orders.

When $l$ is divisible by at least three distinct prime numbers it is obvious
that $s_1$ can be so selected that it has only one letter in common with the first
cycle of $s_2$ and that all the other cycles may be assumed to be of lower prime
power orders. Moreover, when it is desirable to add to $s_1$ another
transposition in order to give it the suitable sign we may form this, in case the
order of the first cycle is divisible by the square of a prime number, on letters
of this first cycle in such a way that the commutator of this cycle and $s_1$ is
composed of a cycle of order 3 and of two transpositions. The square of
this commutator is therefore a cycle of order 3 and may be used just as the
commutator was used in the preceding case. When the order of this first
cycle is not divisible by the square of a prime number, a transposition on the
letters of the first cycle of $s_2$ may be added arbitrarily. Hence $s_2$ and $s_1$ can
always be so selected as to generate either the alternating group of degree $n$
or the symmetric group of this degree, as may be desired, whenever $l$ is
divisible by at least three distinct prime numbers and the group concerned
contains a substitution of order $l$. When $l$ is divisible by two distinct odd
prime numbers, or by one such number and 4, the remarks which have just
been made still apply, and hence we may assume in what follows that $l$ is
either a power of a prime number or the double of a power of an odd prime
number.

When $l=2p_1$, where $p_1>3$ is a prime number, it results again directly
from the preceding theorems that $s_1$ and $s_2$ can be so selected as to generate
either the symmetric or the alternating group, as may be desired, whenever
the group in question involves a substitution of order $l$. When $l=2p_1^\alpha$, $\alpha>1$,
and $p_1$ any odd prime number, it is easy to prove that $s_1$, $s_2$ can be so chosen
that their product contains a cycle of order $p$, as defined above, and that
another transposition can be added to $s_1$ so as to give it the proper sign
without affecting this cycle of order $p$. A simple proof of this fact may be
given as follows. First, select $s_1'$ so as to connect the last letter of the first
cycle of $s_2$ with the first letter of the second cycle, and the first letter of every
other cycle with the second letter of the preceding cycle. When the degree
of $s_2$ is not $n$, we connect also the last letter of $s_2$ with a letter not found in $s_2$,
and when $n$ exceeds the degree of $s_2$ by more than 1 we may connect the
additional letter or letters with the second or the second and third letter of
the first cycle of $s_2$. It was noted above that there could not be more than two
such additional letters. When $s_1'$ is selected in this way it is obvious that
$s_1's_2$ is a single cycle of order $n$. If the $(p+1)$th letter of this cycle, counting
from the next to the last letter of the first cycle of $s_2$, is not in $s_1'$, we add to
$s_1'$ the transposition composed of this letter and the next to the last letter of
the first cycle in $s_2$. The product of $s_2$ and the $s_1$ thus obtained will then involve
a cycle of order $p$ and this cycle will not be affected by adding a properly
chosen transposition to $s_1$ to give it the desired sign.

It remains to consider the case when the $(p+1)$th letter of the given cycle
of order $n$ appears in $s_1'$. If in this case the letter which precedes this $(p+1)$th
letter does not appear in $s_1'$ we start our cycle with the third letter from the
end of the first cycle of $s_2$ and proceed as before. If both the $(p+1)$th letter
and the preceding letter of the cycle of order $n$ appear in $s_1'$ and $p_1>3$ we
start our cycle with the sixth letter from the end of the first cycle of $s_2$ and
proceed as before. If $p_1=3$ we start with the fourth letter from the end of
this cycle. If $s_2$ has been so chosen that it involves as small a number of
transpositions as possible, no other case can present itself and hence it remains
to consider only the cases when $l=6$, and when $l$ is a power of a single prime
number.

When $l=6$ and $s_2$ is positive, $n>6$. When $n=7$ or 8, it follows directly
from the general theorems noted above that $s_1$, $s_2$ can be suitably selected.
When $n>7$ it may be assumed that the first cycle in $s_2$ is of order 6 and that
$s_2$ involves no more than one transposition. It may also be assumed that its
degree is not less than $n-2$, since cycles of order 3 may be added to it if
necessary to increase its degree. Hence it is obvious that $s_1$ and $s_2$ may be
so selected that their product involves a cycle of order $p$ and that $s_1$ is either
positive or negative as may be desired. It remains to consider the case when
$l$ is a power of a single prime number and $l>3$.

The case when $l=p_1$, $p_1$ being a prime number, is especially interesting
since there is an infinite number of values of $n$, one and only one for each
such prime number, such that the symmetric group of degree $n$ contains a
substitution of order $l$ but cannot be generated by two operators of orders 2
and $l$ respectively. The fact that $2p_1-1$ is such a value of $n$ is obvious since
$s_1$ must then be of degree $2p_1-2$ and hence it must be positive. This
substitution and $s_2$ generate the alternating group of degree $n$ according to the
general theorems noted above, and hence it remains to prove that for every
other value of $n$ it is possible to find two substitutions of orders 2 and $l$
respectively which generate either the alternating group or the symmetric
group of degree $n$ as may be desired. When $n<2p_1$, this requires no further
proof since it is included in a general theorem noted above. When $n\ge 3p_1$
it is obvious that $s_1$, $s_2$ can be so chosen that their product involves a cycle of
order $p$. When $n=2p_1+k$, $k<p_1$, we may first consider the case when
$k=p_1-1$. If $p>n-p_1$, it must be at least equal to $n-p_1+2$. Hence we may
connect the letters of the second cycle of $s_2$ by means of $s_1'$ with the letters
which do not appear in $s_2$ and add a suitable transposition to $s_1'$ so as to obtain
a cycle of order $p$ in the product of $s_1'$ and $s_2$. An additional transposition on
the letters of the first cycle of $s_2$ may be added to $s_1'$ so as to give it the
desired sign without affecting this cycle of order $p$, since $p$ cannot exceed $n-3$.
When $p=n-p_1$, then we may connect one of the letters not found in $s_2$
with the first letter in the first cycle of $s_2$ and the other letters not found in $s_2$
with letters of the second cycle of $s_2$. When $p<n-p_1$ we can evidently
proceed in a similar way. Finally, when $k<p_1-1$ we may again connect
by $s_1$ the letters of the second cycle of $s_2$ with those not found in $s_2$ and thus
obtain suitable forms for $s_1$ and $s_2$. Hence the following theorem. If $l>3$
represents a prime number which divides the order of the symmetric group of
degree $n\ne 2l-1$, then it is always possible to find two operators of orders $l$ and 2
respectively which generate this symmetric group and also two such operators
which generate the alternating group of this degree. When $n=2l-1$ it is possible
to find two such generators of the alternating group of degree $n$ but it is impossible
to find two such generators of the symmetric group of this degree.

When $l=p_1^{a}$, where $p_1$ is any odd prime number and $a>1$, the substitution
$s_2$ may be assumed to be of degree $n-k$, $k<p_1$. If $s_2$ involves more than one
cycle we may again suppose that $s_1'$ connects the last letter of the first cycle
of $s_2$ with the first letter of the second cycle and the first letter of every other
cycle with the second letter of the preceding cycle. Moreover, $s_1'$ connects
letters of the last cycle in $s_2$ with the $k$ letters which do not appear in $s_2$.
The product $s_2s_1'$ is a cycle of degree $n$, and if the $(p+1)$th letter of this cycle,
counting from the next to the last letter of the first cycle of $s_2$, does not appear
in $s_1'$ we adjoin to $s_1'$ a transposition composed of this letter and the next to
the last letter in the first cycle of $s_2$. If the said $(p+1)$th letter appears in $s_1'$,
the preceding letter cannot have this property, and hence we begin our cycle
with the third letter from the end of the first cycle and adjoin to $s_1'$ the trans-
position composed of this letter and the $p$th letter in the said cycle of order $n$.
In both cases we obtain a value of $s_1'$ such that $s_2s_1'$ involves a cycle of order
$p$, and that we can add another transposition to $s_1'$, in case we desire to change
its sign, without affecting this cycle of order $p$. When $s_2$ involves only one
cycle, the $k$ letters which do not appear in $s_2$ may be connected with the
first $k$ letters of $s_2$ and we may begin our cycle of order $p$ with the last letter
of $s_2$. The $(p+1)$th letter of this cycle cannot now appear in $s_1'$ and hence
we may adjoin to $s_1'$ the transposition composed of this letter and the last
letter of $s_2$ in order to obtain a cycle of order $p$ in the product $s_2s_1'$; and an
additional transposition can be added to this $s_1'$ without affecting this cycle.
When $l$ is a power of 2 which is divisible by 8, similar considerations obviously
apply, and hence it remains only to consider the case where $l=4$.

While an infinite number of exceptions presented themselves when $l>3$
was assumed to be an odd prime number, one for each such prime, there is
only one exception when $l=4$, since every alternating group which involves
substitutions of order 4 can be generated by two operators of orders 2 and
4 respectively, and every symmetric group, except the symmetric group of
degree 6, can be generated by two such operators whenever its order is
divisible by 4. To prove this theorem it may be assumed that $s_2$ is positive
and that its degree is $n-k$, $k<2$, and that $s_2$ involves at most three trans-
positions. When $k=1$ the letter which is not found in $s_2$ is connected by $s_1'$
with the last letter of $s_2$, while $s_1'$ connects the other letters of $s_2$ as in the pre-
ceding cases so that $s_2s_1'$ is again a single cycle of order $n$. When $n>19$, $s_2$
involves at least four cycles of order 4, and when $k=0$, $s_1'$ is positive when the
transposition is adjoined to give a cycle of order $p$ in $s_2s_1'$. Hence when
$p=n-3$ it is not necessary to adjoin to $s_1'$ an additional transposition to
obtain generators of the alternating group. Generators of the symmetric
group in this case may be obtained by replacing a cycle of order 4 in $s_2$ by
two transpositions. When $p<n-3$ it is clearly possible to assume that $s_2$ is
always positive and to adjoin a transposition to $s_1'$ without affecting the cycle
of order $p$. Hence two substitutions of orders 2 and 4 respectively can always
be found so that they generate either the alternating or the symmetric group
of degree $n$ as may be desired whenever $n>19$.

When $n<20$ the value of $p$ can be so selected as to make the de-
termination of two possible generators $s_1$, $s_2$ very simple. The groups of
degrees 6 and 7 are so well known that it seems unnecessary to give here two
substitutions of orders 2 and 4 respectively which generate the alternating
groups of these degrees or the symmetric group of degree 7 as may be desired.
On the other hand, it may be of some interest to give an outline of a proof
that the symmetric group of degree 6 cannot be thus generated. If it could
be generated by two such substitutions we may assume that $s_2$ would be one
of the following two substitutions, $abcd$, $abcd\cdot ef$. Since the separate groups
generated by these substitutions are transformed into themselves by a group
of order 16 on these letters we need to use for $s_1$ only one of each set of con-
jugates under this group. When $s_2$ is the former of the two given substitutions
$s_1$ may therefore be assumed to be one of the following four substitutions:


$ce\cdot df,\quad be\cdot df,\quad ab\cdot ce\cdot df,\quad ac\cdot be\cdot df.$


In the first case the commutator of $s_2$ and $s_1$ would be $adcef$. This com-
mutator and $s_2^2$ generate the simple group of order 60 since their product is of
order 3. As this group is invariant under $s_1$ and $s_2$, these two substitutions
generate the triply transitive group of order 120. The second and fourth
substitutions to be used for $s_1$ evidently generate an imprimitive group with
$s_2$, while the third and $s_2$ again generate the triply transitive group of order
120, since $s_1s_2$ in this case is $acedf$ and the square of this into $s_2^2$ is again of
order 3. Hence these two substitutions generate the simple group of order
60 which is invariant under $s_1$ and $s_2$.

When $s_2=abcd\cdot ef$ it may be assumed that $s_1$ is one of the following
three substitutions:

$de,\quad ab\cdot cf\cdot de,\quad ac\cdot bf\cdot de.$


In the first case it follows from a general theorem noted above that $s_1$ and $s_2$
generate a group of order 72. In the other two cases it is obvious that they
must also generate an imprimitive group. Hence it has been proved that the
symmetric group of degree 6 cannot be generated by two of its sub-
stitutions of orders 2 and 4 respectively. That is,

Every symmetric group whose order is divisible by 4 except the symmetric
group of degree 6 can be generated by two operators of orders 2 and 4 re-
spectively, and every alternating group which involves operators of order 4 can
be generated by two such operators.
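The exceptional case of degree 6 is small enough to be checked by direct enumeration. The following is a minimal sketch, not part of the original argument: it assumes the six letters are $a,\dots,f$, takes the two conjugacy-class representatives $abcd$ and $abcd\cdot ef$ for the substitution of order 4, pairs each with every substitution of order 2, and computes the order of the group generated. The helper names (`from_cycles`, `compose`, `generated_order`) are illustrative only.

```python
from itertools import permutations

POINTS = "abcdef"  # the six letters permuted, as in the text


def from_cycles(*cycles):
    """Build a permutation (tuple of images of 0..5) from cycles such as 'abcd', 'ef'."""
    image = list(range(len(POINTS)))
    for cycle in cycles:
        idx = [POINTS.index(ch) for ch in cycle]
        for i, point in enumerate(idx):
            image[point] = idx[(i + 1) % len(idx)]
    return tuple(image)


def compose(p, q):
    """Return the permutation 'apply p first, then q'."""
    return tuple(q[i] for i in p)


def generated_order(generators):
    """Order of the group generated by the given permutations (breadth-first closure)."""
    identity = tuple(range(len(POINTS)))
    seen = {identity}
    frontier = [identity]
    while frontier:
        new = []
        for g in frontier:
            for s in generators:
                h = compose(g, s)
                if h not in seen:
                    seen.add(h)
                    new.append(h)
        frontier = new
    return len(seen)


IDENTITY = tuple(range(len(POINTS)))
# All substitutions of order 2 on six letters.
INVOLUTIONS = [p for p in permutations(range(6))
               if p != IDENTITY and compose(p, p) == IDENTITY]

# Up to conjugacy there are two types of substitution of order 4 on six letters.
ORDER_FOUR = [from_cycles("abcd"), from_cycles("abcd", "ef")]

# Orders of all groups generated by one involution and one representative of order 4.
ORDERS = {generated_order([s1, s2]) for s2 in ORDER_FOUR for s1 in INVOLUTIONS}

print(sorted(ORDERS))
print(generated_order([from_cycles("ce", "df"), from_cycles("abcd")]))
print(generated_order([from_cycles("de"), from_cycles("abcd", "ef")]))
```

Since conjugation is an automorphism of the symmetric group, restricting the substitution of order 4 to one representative of each class loses no generality; the order 720 should not appear among the printed orders.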


UNIVERSITY OF ILLINOIS,
URBANA, ILL.




OPTICS IN HYPERBOLIC SPACE* 


BY 
JAMES PIERPONT 


1. INTRODUCTION 
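With Riemann we define the metric of H-space† by

(1)  $ds^2=\dfrac{16R^4\,(dx_1^2+dx_2^2+dx_3^2)}{\lambda^2}=\dfrac{16R^4\,d\sigma^2}{\lambda^2},$

where

(2)  $\lambda=4R^2-x_1^2-x_2^2-x_3^2.$

H-straights are defined by
(3)  $\delta\int ds=0.$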


In order to have a model of this space in which we can see the figures em-
ployed we may regard the $x_1, x_2, x_3$ as rectangular cartesian coordinates. Then
the images of points of $H_3$ are points within the e-sphere $\lambda=0$ which we call
the $\lambda$-sphere. In this model H-straights are e-circles cutting $\lambda=0$ orthogonally.
It is convenient to introduce new variables

(4)  $z_i=\dfrac{4R^2x_i}{\lambda}\quad(i=1,2,3),\qquad z_4=\dfrac{R\mu}{\lambda},$

where
$\mu=x_1^2+x_2^2+x_3^2+4R^2.$
H-planes are defined by a linear relation
$a_1z_1+a_2z_2+a_3z_3+a_4z_4=0.$


In the model they are e-spheres cutting $\lambda=0$ orthogonally. The intersection
of two H-planes is an H-straight. H-planes through the origin O are also
e-planes and the same is true of straights. The H-angle between two curves 
or surfaces or a curve and a surface is the same as the corresponding angle 


* Presented to the Society, December 29, 1926; received by the editors December 20, 1926. 
† For H- read hyperbolic; for e- read euclidean.




in the model. Figures may be moved about freely in H-space as in e-space.
The coordinates $z$ satisfy

(5)  $\{z^2\}=z_1^2+z_2^2+z_3^2-z_4^2=-R^2.$

Besides the points so far considered we have certain ideal points, viz., those
lying on the $\lambda$-sphere; they are at an infinite distance away from any ordinary
point. Their $z$ coordinates satisfy the relation

(6)  $\{z^2\}=0.$


The four planes $z_1=0$, $z_2=0$, $z_3=0$, $z_4=0$ form a
tetrahedron, the plane $z_4=0$ being imaginary.
It may be represented diagrammatically by
Fig. 1. The vertex $A_i$ is opposite $z_i=0$. All
straights perpendicular to $z_i=0$ meet in the
vertex $A_i$. The displacement


A, 


= zesinh@+23cosh@, = 2% Fic. 1 


= %¢ = 2,cosh-+ zssinh @, 


defines a rotation @ about A;. It leaves the plane z,=0 unaltered, a figure in 
this plane being merely moved into a congruent figure. 


2. REFLECTION AND REFRACTION ON A PLANE SURFACE 


We shall suppose that the path of a ray of light in a heterogeneous medium
satisfies

(1)  $\delta\int n\,ds=0,$

where $n$, the index, is a function of the coordinates $x$. When $n$ = constant
this becomes

(2)  $\delta\int ds=0,$

i.e., the path of a ray of light is a straight.

Consider two media of indices $n$, $n'$ separated by an H-plane. A ray
issuing from $A$ arrives at $A'$. It is easy to see that the path lies in a plane
normal to the boundary. We suppose it lies in the $x$, $y$ plane. We must
choose $B$ so that $n\,AB+n'\,BA'$ or $s=n\rho+n'\rho'$ is a minimum. Then

(3)  $\dfrac{ds}{dx}=n\dfrac{d\rho}{dx}+n'\dfrac{d\rho'}{dx}=0.$

Let $CB=x$, $BC'=x'$, $x+x'=c$, a constant, and therefore $dx+dx'=0$. Then
$\cosh(\rho/R)=\cosh(a/R)\cosh(x/R)$, $\cosh(\rho'/R)=\cosh(a'/R)\cosh(x'/R)$;
therefore

$\dfrac{d\rho}{dx}=\cosh(a/R)\dfrac{\sinh(x/R)}{\sinh(\rho/R)},\qquad \dfrac{d\rho'}{dx'}=\cosh(a'/R)\dfrac{\sinh(x'/R)}{\sinh(\rho'/R)}.$

These in (3) give

(4)  $n\cosh(a/R)\dfrac{\sinh(x/R)}{\sinh(\rho/R)}=n'\cosh(a'/R)\dfrac{\sinh(x'/R)}{\sinh(\rho'/R)}.$

But

$\cos\alpha=\dfrac{\tanh(x/R)}{\tanh(\rho/R)},\qquad \cos\alpha'=\dfrac{\tanh(x'/R)}{\tanh(\rho'/R)}.$

These in (4) give

(5)  $n\cos\alpha=n'\cos\alpha'\quad\text{or}\quad \dfrac{\sin\beta}{\sin\beta'}=\dfrac{n'}{n},$

which is the law of sines as in e-geometry.

[Fig. 2]
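The minimum property leading to the law of sines can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes $R=1$, treats $a$ and $a'$ as the perpendicular H-distances of $A$ and $A'$ from the boundary, locates the minimizing $x$ by a plain ternary search (a method chosen here for simplicity, not taken from the text), and then compares $n\cos\alpha$ with $n'\cos\alpha'$, where $\cos\alpha=\tanh(x/R)/\tanh(\rho/R)$. The numerical values are arbitrary.

```python
import math


def refraction_point(a, a_prime, c, n, n_prime, R=1.0, iterations=200):
    """Ternary search for the x in [0, c] that minimises n*rho + n'*rho'."""

    def optical_length(x):
        rho = R * math.acosh(math.cosh(a / R) * math.cosh(x / R))
        rho_prime = R * math.acosh(math.cosh(a_prime / R) * math.cosh((c - x) / R))
        return n * rho + n_prime * rho_prime

    lo, hi = 0.0, c
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if optical_length(m1) < optical_length(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def cos_angle(leg_x, leg_a, R=1.0):
    """Cosine of the angle between ray and boundary: tanh(x/R) / tanh(rho/R)."""
    rho = R * math.acosh(math.cosh(leg_a / R) * math.cosh(leg_x / R))
    return math.tanh(leg_x / R) / math.tanh(rho / R)


# Arbitrary illustrative data (assumed, not from the paper).
a, a_prime, c, n, n_prime = 0.7, 1.1, 1.5, 1.0, 1.5
x = refraction_point(a, a_prime, c, n, n_prime)
lhs = n * cos_angle(x, a)
rhs = n_prime * cos_angle(c - x, a_prime)
print(x, lhs, rhs)
```

The two printed products should agree to within the accuracy of the search.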




In case of reflection, $n=n'$; we find in similar manner that the angle of
incidence equals the angle of reflection.
We see at once the truth of the following 


THEOREM. The image of an object obtained by reflection on a plane 
mirror is the congruent figure back of the mirror and at the same distance as the 
object in front of it. 


Let us consider now refraction on a plane surface, say $z_1=0$.

[Fig. 3]

In Fig. 3 a ray issues from $A$ and strikes the boundary surface at $P$,
making the angle $\alpha'$ with the normal $PN$. The refracted ray $PB'$ makes the
angle $\beta'$ with the normal, where

(6)  $\dfrac{\sin\beta'}{\sin\alpha'}=n<1$, say.

Produced backwards, it cuts $z_2=0$ at $B$. We set
$\alpha+\alpha'=90°$, $\beta+\beta'=90°$, $AO=a$, $BO=b$, $PO=p$
in H-measure. Set also
$C=\cosh(a/R)$, $S=\sinh(a/R)$, $T=\tanh(a/R)$.

Then

$\tan\theta=\dfrac{\tanh(p/R)}{S};\qquad \tan\alpha=\dfrac{T}{\sinh(p/R)}.$

Thus

(7)  $\tan\alpha'=\dfrac{C\tan\theta}{(1-S^2\tan^2\theta)^{1/2}},$

while $\beta'$ is given by (6).




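Let us suppose $\theta$ is small; then (7) gives
$\tan\alpha'=C\theta/(1-S^2\theta^2)^{1/2}$, or $\alpha'=C\theta$,
neglecting higher powers of $\theta$. Thus
$\beta'=n\alpha'=nC\theta.$

Next we have

$\sinh(p/R)=\dfrac{S\tan\theta}{(1-S^2\tan^2\theta)^{1/2}},$

(8)  $\tanh(b/R)=\dfrac{T}{n},$

a constant, neglecting higher powers of $\theta$.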


Suppose now we revolve Fig. 3 about OA. The symmetry of the figure 
gives the following 


THEOREM. A nearly normal pencil of rays issuing from A forms a virtual 
image at B, at a distance b given by (8). 
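The paraxial statement can be illustrated numerically. In the sketch below, which assumes $R=1$ and arbitrary values of $a$, $n$ and $\theta$, the distance $b$ is computed from the exact relations (6), (7) and the right triangle $BOP$ (in which $\tan\beta=\tanh(b/R)/\sinh(p/R)$, with $\beta=90°-\beta'$), and is compared with the limiting value $\tanh(b/R)=T/n$. The function name is hypothetical.

```python
import math


def image_distance(a, n, theta, R=1.0):
    """H-distance b = OB at which the refracted ray, produced backwards, cuts the axis."""
    C, S = math.cosh(a / R), math.sinh(a / R)
    tan_theta = math.tan(theta)
    root = math.sqrt(1.0 - (S * tan_theta) ** 2)
    sinh_p = S * tan_theta / root                      # from tanh(p/R) = S tan(theta)
    alpha_prime = math.atan(C * tan_theta / root)      # equation (7)
    beta_prime = math.asin(n * math.sin(alpha_prime))  # equation (6)
    tanh_b = sinh_p / math.tan(beta_prime)             # right triangle BOP
    return R * math.atanh(tanh_b)


a, n = 0.5, 0.8                                        # assumed sample values
paraxial = math.atanh(math.tanh(a) / n)                # limiting value tanh(b/R) = T/n
narrow = image_distance(a, n, 1e-3)
wide = image_distance(a, n, 0.3)
print(paraxial, narrow, wide)
```

For the small angle the computed $b$ should be very close to the limiting value, while for the larger angle it differs perceptibly, in agreement with the remark that a pencil which is not narrow does not unite in a point.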


Since $\tanh(b/R)<1$ we see that when $a$ is such that
(9)  $\tanh(a/R)\geq n,$
the ray $PB$ does not meet the axis $OA$. Hence


THEOREM. When the pencil enters a denser medium no image is formed 
if A is at a distance a satisfying (9). 


A small rotation @ about the vertex 
A, opposite the plane z,=0 in Fig. 4 
moves O to O’ and LM to L’M’, such 
that A’O’=AO, BO=B’0O’. As A’O’ is 
normal to the plane z=0,. we see the 
rays from A’ behave in the same manner 
as those from A. Thus they meet in B’; 
hence 


THEOREM. The small figure AA'
has BB' as its image.

[Fig. 4]


Let $\delta$ be the distance of $A'$ to $LM$ and $\delta'$ the distance of $B'$ to this line.
Then

$\sinh(\delta/R)=\cosh(\rho/R)\sin\theta,\qquad \rho=OA'.$

Similarly
$\sinh(\delta'/R)=\cosh(\rho'/R)\sin\theta,\qquad \rho'=OB'.$

Hence

$\dfrac{\sinh(\delta'/R)}{\sinh(\delta/R)}=\dfrac{\cosh(\rho'/R)}{\cosh(\rho/R)}.$

This we may call the magnification of the image.

By (8)
$\tanh(\rho'/R)=(1/n)\tanh(\rho/R).$

Thus

$\dfrac{\cosh(\rho'/R)}{\cosh(\rho/R)}=\dfrac{n}{[n^2\cosh^2(\rho/R)-\sinh^2(\rho/R)]^{1/2}}=\dfrac{n}{[1-(1-n^2)\cosh^2(\rho/R)]^{1/2}}.$

This becomes imaginary if

$\cosh^2(\rho/R)>\dfrac{1}{1-n^2},\quad\text{or}\quad \tanh(\rho/R)>n.$

It is not difficult to prove the
It is not difficult to prove the 


THEOREM. If the pencil is not narrow, the rays issuing from A do not
meet in a point. 


3. CENTRAL OPTICAL IMAGERY 


As a first approximation to the path of light through a system of lenses
the theory of collineation was shown by Maxwell and Abbe to be of extreme
value. We shall now show that H-optics does not have this elegant tool to
work with. We suppose that we are dealing with a symmetrical optical
system whose axis is the x-axis. The collineation defined by this system must
have the form


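(1)  $z_i'=a_{i1}z_1+a_{i2}z_2+a_{i3}z_3+a_{i4}z_4\qquad(i=1,2,3,4).$

Let $A$, $B$ be two symmetric points in the object space relative to the axis,
and $A'$, $B'$ their images. If the coordinates of $A$ are $(z_1, z_2, z_3, z_4)$, the co-
ordinates of $B$ are $(z_1,-z_2,-z_3,z_4)$ while the coordinates of $B'$ are
$(z_1',-z_2',-z_3',z_4')$. These in (1) give

(2)  $\begin{aligned} z_1'&=a_{11}z_1-a_{12}z_2-a_{13}z_3+a_{14}z_4,\\ -z_2'&=a_{21}z_1-a_{22}z_2-a_{23}z_3+a_{24}z_4,\\ -z_3'&=a_{31}z_1-a_{32}z_2-a_{33}z_3+a_{34}z_4,\\ z_4'&=a_{41}z_1-a_{42}z_2-a_{43}z_3+a_{44}z_4. \end{aligned}$

Comparing (1), (2) we get
$a_{12}=a_{13}=0,\quad a_{21}=a_{24}=0,\quad a_{31}=a_{34}=0,\quad a_{42}=a_{43}=0.$

Thus (1) reduces to

(3)  $z_1'=a_{11}z_1+a_{14}z_4,\quad z_2'=a_{22}z_2+a_{23}z_3,\quad z_3'=a_{32}z_2+a_{33}z_3,\quad z_4'=a_{41}z_1+a_{44}z_4.$

Since the collineation is central we may restrict ourselves to a plane, say the
$x_1x_2$ plane. We may thus write (3)

(4)  $z_1'=az_1+bz_3,\quad z_2'=cz_2,\quad z_3'=\alpha z_1+\beta z_3.$

Since the coordinates $z'$ must satisfy the relation §1, (5), we find that
(5)  $a^2-\alpha^2=1,\quad b^2-\beta^2=-1,\quad ab=\alpha\beta.$
Hence

(6)  $z_1'=az_1+bz_3,\quad z_2'=\delta z_2,\quad z_3'=\epsilon(bz_1+az_3),$

where $\delta^2=\epsilon^2=1$.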
Now the equations (6) define a displacement, i.e., the image is merely a
congruent figure of the object. Hence


THEOREM. If the image afforded by a central optical system is the result
of a collineation, the image is an exact replica of the object without magnification.




If the coordinates of an ideal point are set in (6) we find $\{z'^2\}=0$; hence


THEOREM. As a point A recedes to infinity, its image A' does the same.

4. REFLECTION ON A SPHERE


If a ray issuing from $A$ meets the spherical mirror $C$ at a point $B$, it is
reflected along $BB'$:

$OA=a,\quad OB=r,\quad AB=c,\quad OA'=a',$

in H-measure.

[Fig. 6]

In the triangle $OAB$,

(1)  $\cosh(c/R)=\cosh(a/R)\cosh(r/R)-\sinh(a/R)\sinh(r/R)\cos\theta,$

(2)  $\sin\beta=\dfrac{\sinh(a/R)}{\sinh(c/R)}\sin\theta.$

In the triangle $OBA'$,

(3)  $\cos\psi=\cos\beta\cos\theta-\sin\beta\sin\theta\cosh(r/R),$

(4)  $\sinh(a'/R)=\dfrac{\sin\beta}{\sin\psi}\sinh(r/R).$

These equations give the path of the reflected ray. 

The H-length of the arc $MB=l=R\theta\sinh(r/R)$. We set $l\leq l_0$, $\theta\leq\theta_0$, and
suppose $\theta_0$ and $l_0/R$ are small. We will assume that $r/R$ is small while $a/R$
is large. Then approximately

$\cosh(a/R)=\sinh(a/R)=\tfrac12e^{a/R}.$
Then (1) gives
$\cosh(c/R)=\tfrac12e^{a/R}=\cosh(a/R)$; therefore $a=c$.

This in (2) gives
$\beta=\theta.$
Thus (3) gives

$\cos\psi=\cos^2\theta-\sin^2\theta\cosh(r/R)$


2R? 2 




Hence
$\sin\psi=2\theta$, or $\psi=2\theta$.
This in (4) gives

$\sinh(a'/R)=\tfrac12\sinh(r/R)=\tfrac12(r/R),$

therefore
$a'/R=\tfrac12(r/R)$ or $a'=\tfrac12r$.

Hence rays issuing from $A$ meet at a fixed point $A'$.
Let us now rotate Fig. 6 through a small angle about $O$. The point $A$
describes the arc $AP$ while its virtual image describes the arc $A'P'$. Hence

THEOREM. The image P'Q' of a small distant object PQ in a small mirror
of radius r, small compared with the space constant R, is virtual and lies at
the distance $\tfrac12r$ from the center of the mirror.
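The approximation $a'=\tfrac12r$ may be compared with the exact trigonometric relations. The following sketch assumes the four relations of this section in the form: (1) the law of cosines in $OAB$, (2) $\sin\beta=\sinh(a/R)\sin\theta/\sinh(c/R)$, (3) $\cos\psi=\cos\beta\cos\theta-\sin\beta\sin\theta\cosh(r/R)$, (4) $\sinh(a'/R)=\sin\beta\,\sinh(r/R)/\sin\psi$. It takes $R=1$ and arbitrary sample values with $r/R$ small, $a/R$ large and $\theta$ small; the function name is hypothetical.

```python
import math


def reflected_axis_distance(a, r, theta, R=1.0):
    """H-distance a' = OA' obtained from relations (1)-(4) for a ray striking the mirror at angle theta."""
    cosh_c = (math.cosh(a / R) * math.cosh(r / R)
              - math.sinh(a / R) * math.sinh(r / R) * math.cos(theta))   # (1)
    c = R * math.acosh(cosh_c)
    beta = math.asin(math.sinh(a / R) / math.sinh(c / R) * math.sin(theta))  # (2)
    cos_psi = (math.cos(beta) * math.cos(theta)
               - math.sin(beta) * math.sin(theta) * math.cosh(r / R))    # (3)
    psi = math.acos(cos_psi)
    return R * math.asinh(math.sin(beta) / math.sin(psi) * math.sinh(r / R))  # (4)


a, r, theta = 20.0, 0.01, 1e-3        # assumed: distant object, small mirror, narrow pencil
a_prime = reflected_axis_distance(a, r, theta)
print(a_prime, r / 2)
```

With these values the computed $a'$ should differ from $\tfrac12r$ by well under two per cent, the discrepancy being of the order of $r/R$.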


5. REFRACTION ON SPHERICAL SURFACES 


In Fig. 8, a ray issues from $O$ meeting the spherical surface at $B$ and
makes the angle $\alpha$ with the normal $BN$. It is bent into $BH$ which produced
backwards meets $OA$ at $K$. Its tangent at $B$ is $BH$:

[Fig. 8]

$OA=a,\quad OB=s,\quad OH=h,\quad \mathrm{rad}\,AB=r,$ in e-measure;
$OK=\eta,\quad OB=\sigma,\quad OA=a,$ in H-measure.


[Fig. 7]


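(1)  $\tan\theta=\dfrac{r\sin\phi}{a-r\cos\phi},$

(2)  $\sin\beta=n\sin\alpha,\quad n<1,\quad \alpha=\theta+\phi,$

(3)  $s=\dfrac{r\sin\phi}{\sin\theta},\qquad h=\dfrac{s\sin(\alpha-\beta)}{\sin\gamma}.$

In the H-triangle $OBK$,

(4)  $\cos\chi=\cos(\alpha-\beta)\cos\theta+C\sin(\alpha-\beta)\sin\theta,$

(5)  $\sinh(\eta/R)=S\,\dfrac{\sin(\alpha-\beta)}{\sin\chi},$

where

(6)  $C=\cosh(\sigma/R),\quad S=\sinh(\sigma/R).$

These equations give the refracted ray.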


These equations give the refracted ray. 


We suppose now that $\phi$ is small. From (1) we have, neglecting small
quantities of higher order,

(7)  $\theta=\dfrac{r\phi}{a-r}.$

From (2),
(8)  $\alpha=\theta+\phi=\dfrac{a\phi}{a-r}.$

Then
(9)  $\alpha/\theta=a/r$, a constant.

From (2),
(10)  $\beta=n\alpha,\qquad \alpha-\beta=(1-n)\alpha=m\alpha,\qquad m=1-n.$

From (3),
(11)  $s=r\phi/\theta=a-r.$

Thus $s$ and hence $\sigma$, its H-measure, is constant. From (4),

$\cos\chi=1-\tfrac12(m^2\alpha^2+\theta^2-2m\alpha\theta C)=1-\tfrac12\theta^2X^2,$

where

$X^2=m^2(\alpha^2/\theta^2)+1-2mC(\alpha/\theta);$

therefore, using (9),
(12)  $X^2=m^2(a^2/r^2)+1-2mC(a/r)$, a constant.

Thus
$\sin\chi=\theta\cdot X.$

From (9), (10) and (5),

(13)  $\sinh(\eta/R)=S\,\dfrac{\sin(\alpha-\beta)}{\sin\chi}=\dfrac{maS}{rX}$, a constant.

Hence

THEOREM. Neglecting small quantities of order >1, rays issuing from O
and meeting a convex spherical surface nearly normally, meet at the virtual
image K at a distance $\eta$ given by (13).


For $\chi$ to be real, $X^2$ must be $\geq 0$, or

(14)  $C=\cosh(\sigma/R)\leq\dfrac{m^2a^2+r^2}{2amr}.$

Now as $a\to 2R$, $\sigma\to\infty$ while the right side is finite; we have the

THEOREM. When the source moves away beyond a certain distance given
by (14) there is no image.

We note that when the = sign holds in (14) the point $K$ is at $\infty$.

Let us displace Fig. 8 by a rotation about the pole of $OA$,
so that $A$ coincides with the origin $O$. Let the source be now
at $A$, Fig. 9, and its image at $C$.

Since distances and angles have remained unaltered we have as before 
AB =z, OK and relation (5) still holds. 

Let us now revolve Fig. 9 about $O$ through a small angle $\tau$. Then $A$ and $C$
describe small arcs $AA'$, $CC'$ of lengths

$\delta=R\tau\sinh(\omega/R),\qquad \delta'=R\tau\sinh[(\omega+\eta)/R].$

We may regard as the magnification of the object the quotient

(15)  $\dfrac{\delta'}{\delta}=\dfrac{\sinh[(\omega+\eta)/R]}{\sinh(\omega/R)}.$

[Fig. 9]




Finally let us note that the point $Z$ is fixed. For from (3)

$h=\dfrac{ms\alpha}{\theta-m\alpha},$

which is constant since $\alpha/\theta$ is, by (9). We note that when the source of
light recedes to infinity the point $Z$ is real.


6. LENSES 


In practical optics when one wishes to calculate the best shapes of a new 
system of lenses designed to achieve a certain object, it is necessary in the 
end to make laborious trigonometric calculations. In H-space it seems 
necessary to do this even when one wishes only approximate results since 
the theory of collineation does not apply. As in the preceding sections, so 
here, we make a favorable choice of the origin, sometimes using e-measure 
and sometimes H-measure, angles having the same measure in both 
geometries. 


[Fig. 10]


In Fig. 10 we suppose the center of the lens C is at the origin O while the
center of C’ is at A’. The source of light is at 2. A ray meets C at B and is 
bent into $BB'$, which as before we may suppose to be an e-straight. At $B'$
it is bent into $B'K'$ whose e-tangent at $B'$ is $B'H'$. The incident ray at $B'$
makes the angle $\alpha'$, the emergent ray makes the angle $\beta'$ with the normal
$B'N'$ at $B'$. Then

(1)  $\sin\beta'=\dfrac{1}{n}\sin\alpha'.$


We set 
8 =OBH, = B'HA’, ¢' = B’A'0, ¢= BOA’, & = BOA’, 
y’ B’H'O, x’ B’K'O; 




OH = h, OH’ =h', OA’ =a’, OB’ =s', l= 
OB =r, A'B’ =r’, in e-measure ; 
o’ = OB’, »! = OK’ in H-measure ; 
C’ = cosh (o’/R), S’ = sinh (o’/R). 
We saw in §5 that Z is fixed when ¢ is small. We have now 
(2) sina’ = (l/r’) siny, 
r’ sin ¢’ 


3 tan 6? = ——_____ 
(3) a’ — cos 


r’ sin ¢’ 


(4) 


sin 6’ 
In the H-triangle OB’K’, 5=OB'K’ =0'+¢' ana 
(5) cos x’ = cosé cos @’ + C’ siné sin 6’, 
in 6 
(6) sinh (n'/R) = S'— 


sin x’ 


Suppose again that is small. Then 
(7) =W/r', 
(8) 

1+ 


fut 


(9) §= = gv, gconstant, 


(10) s’ = r'¢'/ = (a’ — r’)/r’, aconstant. 
Thus neglecting small quantities of order >1, s’ and hence a’, also C’, S’ 
are constants. From (5), 
cos x’ = (1 — 52/2)(1 — 02/2) + 2C’S6’ 
= 1 — + 52/6"? — 2C'3/0’) = 1 — 40"2X?, 
therefore 


sinx’ = 0X, x’ 
From (6) 


6 6S’ 
(11) sinh (n//R) = 


Now 



i 

n nr” 




therefore 


gnr 


whence 6/6’ is constant, and therefore by (11), 7’ is constant. 
In order that x’ be real, we must have X?20 or 


SC tS’ =A= etek, 
or 
l a’ 
(12) a’ — — 2A or na’ — ————- 2 A. 
gn i-r/l 
When r’// can be neglected, this gives 
(13) => (1—n)a’ + mA. 
Hence the 


THEOREM. If the conditions (12) or (13) are satisfied, rays issuing from
a point and meeting a convex lens nearly normally will unite in a conjugate 
point K’ determined by (11). 


7. BOUGUER'S THEOREM


We suppose that the index of refraction $n$ at a point $(x_1, x_2, x_3)$ is a function
of the distance of the point from the origin $O$. The path of a ray in this
medium is determined by

$\delta\int n\,ds=0.$


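We have

$\delta(n\,ds)=n\cdot\delta ds+ds\cdot\delta n,\qquad \delta n=\sum_a\dfrac{\partial n}{\partial x_a}\,\delta x_a,$

$\delta ds=\dfrac{16R^4}{\lambda^2}\sum_a\dfrac{dx_a}{ds}\,d\,\delta x_a+\dfrac{2}{\lambda}\sum_a x_a\,\delta x_a\,ds;$

therefore

$\delta\int n\,ds=\int\sum_a\left\{\dfrac{\partial n}{\partial x_a}+\dfrac{2nx_a}{\lambda}-16R^4\dfrac{d}{ds}\left(\dfrac{n}{\lambda^2}\dfrac{dx_a}{ds}\right)\right\}\delta x_a\,ds.$

Thus the equations of the path are

(1)  $\dfrac{\partial n}{\partial x_a}+\dfrac{2nx_a}{\lambda}=16R^4\dfrac{d}{ds}\left(\dfrac{n}{\lambda^2}\dfrac{dx_a}{ds}\right)\qquad(a=1,2,3).$

The matrix

(2)  $\begin{pmatrix} x_1 & x_2 & x_3\\[4pt] \dfrac{n}{\lambda^2}\dfrac{dx_1}{ds} & \dfrac{n}{\lambda^2}\dfrac{dx_2}{ds} & \dfrac{n}{\lambda^2}\dfrac{dx_3}{ds} \end{pmatrix}$

has three determinants $D_a$. From (1) we find at once that

$\dfrac{dD_a}{ds}=0.$

Hence

$D_a=a_a$, a constant.

If we multiply these three equations by $x_1, x_2, x_3$ we get $a_1x_1+a_2x_2+a_3x_3=0$.
Hence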

THEOREM. The path of a ray in this medium lies in an H-plane through O. 


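In our model, the radius vector $r$ from $O$ to the point $P(x_1, x_2, x_3)$ has as
direction cosines $l_a=x_a/r$, while those of the ray are $m_a=dx_a/d\sigma$. If $\theta$ is the
angle between $r$ and the ray,

$\cos\theta=\sum_a l_am_a.$

Now

$\dfrac{dx_a}{ds}=\dfrac{\lambda}{4R^2}\dfrac{dx_a}{d\sigma};$

therefore

$\cos\theta=\dfrac{4R^2\lambda}{nr}\sum_a x_a\cdot\dfrac{n}{\lambda^2}\dfrac{dx_a}{ds}.$

Here the sum on the right is the scalar product of the rows of the matrix (2). By
Lagrange's theorem,

$\sum_a D_a^2=\dfrac{n^2}{\lambda^4}\left[\sum_a x_a^2\cdot\sum_a\left(\dfrac{dx_a}{ds}\right)^2-\left(\sum_a x_a\dfrac{dx_a}{ds}\right)^2\right]=\dfrac{n^2r^2}{16R^4\lambda^2}(1-\cos^2\theta)=\dfrac{n^2r^2}{16R^4\lambda^2}\sin^2\theta.$

Hence

(3)  $\dfrac{nr}{\lambda}\sin\theta=$ constant along the ray.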


This gives Bouguer's theorem in $H_3$-space.
THEOREM. The path of a ray of light in a medium whose index is a function
only of the distance from O satisfies (3).


If $\rho$ is the length of the vector $OP$ in H-measure we have

$r=2R\tanh(\rho/2R),\qquad \lambda=4R^2\,\mathrm{sech}^2(\rho/2R),\qquad \dfrac{r}{\lambda}=\dfrac{\sinh(\rho/R)}{4R}.$

Thus (3) becomes
(4)  $n\sinh(\rho/R)\sin\theta=c$, a constant,
where the quantities $\rho$, $\theta$, $n$ are now expressed in H-measure. From this
relation we get the equation of the path of the ray. For

$d\rho=\cos\theta\,ds,\qquad d\phi=\dfrac{\sin\theta\,ds}{R\sinh(\rho/R)};$

therefore

$\dfrac{d\rho}{d\phi}=\dfrac{\cos\theta}{\sin\theta}\,R\sinh(\rho/R),$

or using (4)

(5)  $\dfrac{d\phi}{d\rho}=\dfrac{c}{R\sinh(\rho/R)}\cdot\dfrac{1}{(n^2\sinh^2(\rho/R)-c^2)^{1/2}}.$

Here $n$ is a function of $\rho$ only.
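Two consequences of these formulas lend themselves to a numerical check. The sketch below is an illustration under stated assumptions, not part of the original derivation: it verifies the identity $r/\lambda=\sinh(\rho/R)/4R$ with $\lambda=4R^2-r^2$, and, for a constant index $n$, compares the slope $d\phi/d\rho=c/[R\sinh(\rho/R)\,(n^2\sinh^2(\rho/R)-c^2)^{1/2}]$ with a finite-difference derivative of the polar equation of an H-straight, $\cos\phi=\tanh(\rho_0/R)/\tanh(\rho/R)$, where $\rho_0$ is the least distance of the straight from $O$ and $c=n\sinh(\rho_0/R)$. The names and sample values are assumptions.

```python
import math


def bouguer_slope(rho, n, c, R=1.0):
    """d(phi)/d(rho) along the ray for index n and Bouguer constant c."""
    s = math.sinh(rho / R)
    return c / (R * s * math.sqrt((n * s) ** 2 - c ** 2))


def straight_angle(rho, rho0, R=1.0):
    """Polar angle phi(rho) on an H-straight whose least distance from O is rho0."""
    return math.acos(math.tanh(rho0 / R) / math.tanh(rho / R))


R, rho0, n = 2.0, 0.6, 1.3            # assumed sample values
c = n * math.sinh(rho0 / R)           # value of the constant in (4) at the nearest point
rho, h = 1.4, 1e-6
numeric = (straight_angle(rho + h, rho0, R) - straight_angle(rho - h, rho0, R)) / (2 * h)
formula = bouguer_slope(rho, n, c, R)

# Identity r/lambda = sinh(rho/R)/(4R) in the model, with lambda = 4R^2 - r^2.
r = 2 * R * math.tanh(rho / (2 * R))
ratio = r / (4 * R * R - r * r)
print(numeric, formula, ratio, math.sinh(rho / R) / (4 * R))
```

When $n$ is constant the ray is an H-straight, so the two slopes should agree to within the accuracy of the finite difference.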


YALE UNIVERSITY,
NEW HAVEN, CONN.


GEODESICS ON SURFACES OF GENUS ZERO 
WITH KNOBS* 


BY 
DONALD EVERETT RICHMOND 


INTRODUCTION 


Poincaré has studied the geodesics upon closed surfaces of genus zero
which are everywhere convex.† General surfaces of genus greater than one
have been studied by H. M. Morse.‡ The surfaces considered in this paper are
of genus zero and closed, but have upon them regions of negative curvature
as well as regions of positive curvature. They are of such a nature that the
removal of certain portions which we call knobs leaves a region with extremal-
convex boundaries. A remarkable subset of the geodesics issuing from any
point not on a knob is considered, and results are obtained resembling those
of Hadamard for geodesics on surfaces of negative curvature.§


PART I. THE SURFACE


1. The surface defined. Let us consider a closed surface homeomorphic
with a sphere. We assume that the surface can be divided into a finite number
of overlapping regions, such that the cartesian coordinates $x$, $y$, $z$ of the points
of any of these regions can be expressed in terms of parameters $u$ and $v$ by
means of functions with continuous derivatives of at least the fourth order
and such that

$\left[\dfrac{D(x,y)}{D(u,v)}\right]^2+\left[\dfrac{D(x,z)}{D(u,v)}\right]^2+\left[\dfrac{D(y,z)}{D(u,v)}\right]^2\neq 0.$

It is assumed, moreover, that the surface possesses $n$ knobs ($n>1$), where a
knob will be defined as a finite portion $K$ of the surface which is: (a) bounded
by a closed curve $C$ with continuously turning tangent; (b) homeomorphic
with the interior and boundary points of a circle; (c) such that the geodesics

* Presented to the Society, January 2, 1926; received by the editors in August, 1926.
† Poincaré, Sur les lignes géodésiques des surfaces convexes, these Transactions, vol. 6 (1905).
‡ Morse, A fundamental class of geodesics on any closed surface of genus greater than one, these
Transactions, vol. 26 (1924), pp. 49-60.
§ Hadamard, Les surfaces à courbures opposées et leurs lignes géodésiques, Journal de Mathé-
matiques pures et appliquées, (5), vol. 4 (1896), p. 27.




tangent to the boundary C lie interior to K in the immediate neighborhood 
of the point of contact. Finally, the curves C bounding different knobs on 
the surface are assumed to have no points in common with each other. 

Let now the knobs be cut from the body of the surface along the curves C. 
There will remain a surface $S$ with $n$ closed bounding curves. $S$ will be
extremal-convex in the sense* that, on the original uncut surface, geodesics 
tangent to the boundary of S lie outside of S in the immediate neighborhood 
of the point of contact. In this paper we consider geodesics in so far as they 
lie on S. 

2. Sufficient conditions for the existence of knobs. If any point $P$ on a
regular surface is taken as origin of a system of geodesic polar coordinates,
the element of arc takes the form

$ds^2=dr^2+C^2(r,\phi)\,d\phi^2,$

in which $r$ is the distance measured from $P$ along any geodesic through $P$,
and $\phi$ is the angle between the tangent at $P$ to this geodesic and that to an
arbitrary geodesic through $P$. Also $C(0,\phi)=0$ and $C_r(0,\phi)=1$.†

Within the region where the geodesics $\phi$ = constant form a field, all the
geodesics not $\phi$ = constant are solutions of the Euler equation for

$\int(\dot r^2+C^2)^{1/2}\,d\phi,$

namely

$\ddot r=\dfrac{C_\phi}{C}\,\dot r+\dfrac{2C_r}{C}\,\dot r^2+CC_r.$

For a geodesic which is tangent to the geodesic circle $r=r_0$, $\dot r=0$ at the point
of contact. At this point, therefore,

(1)  $\ddot r=CC_r.$

In order that the simply connected region bounded by $r=r_0$ be a knob
it is necessary that for each geodesic tangent to $r=r_0$, $\ddot r\leq 0$ at the point of
tangency. A sufficient condition is that $\ddot r<0$ and hence from (1), it is also
sufficient that $C(r_0,\phi)>0$ and $C_r(r_0,\phi)<0$ for all values of $\phi$.

We shall deduce a sufficient condition for the existence of a knob in terms 
of the total curvature. 


* Cf. G. D. Birkhoff, Dynamical systems with two degrees of freedom, these Transactions, vol. 18 
(1917), p. 216. 
† Cf. Darboux, Surfaces, III, p. 157.




Let any point $P$ on a regular surface be taken as the origin of a system
of geodesic polar coordinates $(r,\phi)$ and let $K(r,\phi)$ be the total curvature of
the surface at the point $(r,\phi)$.


THEOREM 1. If there exists a constant $r_0>0$ such that

(2)  $\left(\dfrac{\pi}{2r_0}\right)^2<K(r,\phi)<\left(\dfrac{\pi}{r_0}\right)^2$

for any $\phi$ and for $r\leq r_0$, then the set of points for which $r\leq r_0$ form a knob.


By Gauss's theorem*

$\dfrac{\partial^2C}{\partial r^2}+KC=0.$

Consider any value $\phi_0$ of $\phi$. We assume that (2) holds for $r\leq r_0$. We shall
compare the solutions of the differential equations

(a)  $\dfrac{d^2x}{dr^2}+\left(\dfrac{\pi}{2r_0}\right)^2x=0,$

(b)  $\dfrac{\partial^2C}{\partial r^2}+K(r,\phi_0)\,C=0,$

(c)  $\dfrac{d^2z}{dr^2}+\left(\dfrac{\pi}{r_0}\right)^2z=0,$

which have the initial conditions

$x(0)=0,\quad \dfrac{dx(0)}{dr}=1;$

$C(0,\phi_0)=0,\quad \dfrac{\partial C(0,\phi_0)}{\partial r}=1;$

$z(0)=0,\quad \dfrac{dz(0)}{dr}=1.$

By the use of Sturm's comparison theorem,† it follows from (b) and (c)
and the inequality $K<(\pi/r_0)^2$, that $C(r,\phi_0)>z(r)>0$ within the interval
$0<r<r_0$. Similarly, from (a) and (b) and the inequality $(\pi/2r_0)^2<K$, we
have $C(r,\phi_0)<x(r)$ for the same interval. Since $C(r,\phi_0)$ is bounded in the
interval, it takes on a maximum for a value $r_m$ of $r$ where

$0<r_m\leq r_0.$


* Cf. W. Blaschke, Vorlesungen über Differentialgeometrie, I, p. 61.
† Bieberbach, Theorie der Differentialgleichungen, 1923, pp. 144-5.


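We shall prove that

$\dfrac{r_0}{2}<r_m<r_0.$

Suppose first that $0<r_m\leq r_0/2$. From (b) and (c), omitting the arguments
in the functions concerned, we find readily

$Cz_{rr}-zC_{rr}=\left[K-\left(\dfrac{\pi}{r_0}\right)^2\right]Cz.$

Integrating with respect to $r$ from $r=0$ to $r=r_m$, we have

$\Big[Cz_r-zC_r\Big]_0^{r_m}=\int_0^{r_m}\left[K-\left(\dfrac{\pi}{r_0}\right)^2\right]Cz\,dr.$

Since $z(0)=C(0)=C_r(r_m)=0$,

$C(r_m)\,z_r(r_m)=\int_0^{r_m}\left[K-\left(\dfrac{\pi}{r_0}\right)^2\right]Cz\,dr<0.$

But $C(r_m)>0$ and $z_r(r_m)\geq 0$ for $r_m\leq r_0/2$. Hence, a contradiction is obtained.
Similarly, the assumption $r_m=r_0$ leads to a contradiction.
Therefore

$\dfrac{r_0}{2}<r_m<r_0,$

as stated. Hence $C(r_0,\phi_0)>0$ and $C_r(r_0,\phi_0)<0$ and by (1), $\ddot r<0$. Since (2)
holds for any value $\phi_0$ of $\phi$, the theorem follows.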
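The comparison argument can be illustrated by integrating equation (b) numerically for a sample curvature. The following sketch makes several assumptions not found in the text: it takes $r_0=1$, chooses the hypothetical curvature $K(r)=(\pi/r_0)^2(0.3+0.6\,r/r_0)$, which lies strictly between the two bounds of (2), and integrates $C''+K(r)C=0$, $C(0)=0$, $C'(0)=1$ by the classical fourth-order Runge-Kutta method. It then reports the location $r_m$ of the maximum of $C$ together with $C(r_0)$ and $C_r(r_0)$.

```python
import math


def integrate_C(K, r0, steps=2000):
    """RK4 for C'' + K(r) C = 0, C(0) = 0, C'(0) = 1 on [0, r0].

    Returns (r_m, C(r0), C_r(r0)), where r_m is the first zero of C_r.
    """
    h = r0 / steps
    r, C, D = 0.0, 0.0, 1.0          # D stands for the derivative C_r
    r_m = None
    for _ in range(steps):
        k1C, k1D = D, -K(r) * C
        k2C, k2D = D + 0.5 * h * k1D, -K(r + 0.5 * h) * (C + 0.5 * h * k1C)
        k3C, k3D = D + 0.5 * h * k2D, -K(r + 0.5 * h) * (C + 0.5 * h * k2C)
        k4C, k4D = D + h * k3D, -K(r + h) * (C + h * k3C)
        C_new = C + h * (k1C + 2 * k2C + 2 * k3C + k4C) / 6.0
        D_new = D + h * (k1D + 2 * k2D + 2 * k3D + k4D) / 6.0
        if r_m is None and D > 0.0 >= D_new:
            r_m = r + h * D / (D - D_new)   # linear interpolation of the zero of C_r
        r, C, D = r + h, C_new, D_new
    return r_m, C, D


r0 = 1.0


def sample_curvature(r):
    """Assumed curvature lying strictly between (pi/2r0)^2 and (pi/r0)^2."""
    return (math.pi / r0) ** 2 * (0.3 + 0.6 * r / r0)


r_m, C_end, Cr_end = integrate_C(sample_curvature, r0)
print(r_m, C_end, Cr_end)
```

For this sample the output should show $r_0/2<r_m<r_0$, $C(r_0)>0$ and $C_r(r_0)<0$, which are the conditions used in the proof.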

3. Class A geodesic segments. Consider now the surface $S$ and denote
the $n$ extremal-convex boundaries by $c_1, c_2, \cdots, c_n$. If $P$ and $Q$ are any two
points within or on the boundary of $S$, there exists on $S$ joining $P$ to $Q$ at
least one rectifiable curve whose length furnishes a minimum with respect
to the lengths of all rectifiable curves on $S$ connecting $P$ and $Q$.* Such a
minimizing arc is a geodesic segment with continuously turning tangent and
has no points in common with any boundary $c_i$, with the possible exception
of $P$ and $Q$ themselves. Such geodesic segments will be called Class A geodesic
segments on $S$.

No two class A segments on $S$ not joining the same points can intersect
more than once. For suppose two such segments $PQ$ and $RS$ intersect in
$D$ and $E$. Then it follows from the definition of geodesics of class A that the
segments $DE$ of the two geodesics have the same length. In a portion of $PQ$

* Bolza, Vorlesungen über Variationsrechnung, 1909, pp. 422, 436.




including DE as an interior segment, the arc DE can be replaced by its 
equal segment on RS. The resulting curve, however, has corners and hence 
can be shortened,* contrary to the assumption that PQ was a class A segment. 

4. The covering surface and linear sets. The surface $S$ can now be
rendered simply connected as follows: From any arbitrarily chosen point
$P$ on $c_1$, we can and will cut the surface along a system of class A geodesics
$h_2, h_3, \cdots, h_n$, leading to points on $c_2, c_3, \cdots, c_n$ respectively. Ac-
cording to the result of the preceding paragraph, the geodesics $h_i$ have no
other points than $P$ in common. We denote by $T$ the simply-connected piece
of surface obtained by cutting $S$ along the $h$'s.

We now consider $M$, the universal covering surface† of $S$, made up of an
infinite number of copies of $T$. On $M$ any two points or curves which overlie
the same point or curve on $S$ are said to be congruent. The boundaries of $M$
are congruent to the boundaries $c_1, c_2, \cdots, c_n$ of $S$.

Let $r$ be an integer, positive, negative or zero. Let $T_r$ denote a particular
copy of $T$ on $M$. A linear set‡ of copies of $T$ on $M$ will be defined to be a
region of $M$ consisting of a set of the copies of $T$ on $M$ of the form

(1)  $\cdots,\ T_{-2},\ T_{-1},\ T_0,\ T_1,\ T_2,\ \cdots,$

or of the form of any subset of consecutive symbols of (1), in which each
copy $T_r$ of $T$ is joined to the succeeding one along a common boundary and
all copies are distinct. A linear set which has no first or last copy of $T$ will be
termed an unending linear set.


Part II. THE CLASS A GEODESIC RAYS THROUGH A POINT 


5. Unending geodesics of class A. We have defined (§3) a geodesic 
segment joining a pair of points P and Q on S to be of class A on S if its 
length furnishes a minimum with respect to the lengths of all rectifiable 
curves on S joining P and Q. We shall now define similarly a geodesic segment 
joining a pair of points P and Q on M to be of class A on M provided its 
length furnishes a minimum with respect to the lengths of all rectifiable 


* In the regular problem of the calculus of variations, the Erdmann corner point condition is 
never satisfied. 

† Cf. H. M. Morse, A one-to-one representation of geodesics on a surface of negative curvature, 
American Journal of Mathematics, vol. 43 (1921), pp. 35-40; H. Weyl, Die Idee der Riemannschen 
Fläche, pp. 47-53; Kerékjártó, Vorlesungen über Topologie, I, pp. 158, 173-184. 

‡ For another point of view, cf. Oswald Veblen, Analysis Situs, The Cambridge Colloquium, 
Part 2, Chapter V. The linear set used in our paper has, however, the important metrical property 
of being bounded by geodesic segments congruent to the boundaries of T, in addition to the abstract 
properties of the members of the Poincaré group. 




curves on M joining P and Q. Every geodesic segment of class A on S is also 
of class A on M but the converse is not true. 

Now any two points P and Q on M can be included in a region R consisting 
of a finite number of copies of T. The boundaries of R will be found in part 
among the boundaries of M, composed of segments congruent to the $c_i$’s, 
and in part among the geodesic segments of class A on M congruent to the 
$h_i$’s. The segments of the boundary of R congruent to the $c_i$’s intersect the 
segments congruent to the $h_i$’s so that the angles interior to R do not exceed 
$\pi$. The region R will still be extremal-convex in the sense of Birkhoff,* and 
it follows as in §3 that the points P and Q on R can be joined by a geodesic 
segment of class A on M which lies in R and has no points in common with the 
boundary of R with the possible exception of P and Q themselves. 

We shall now define an unending geodesic of class A on M as an unending 
geodesic lying entirely on M, every finite segment of which is a geodesic seg- 
ment of class A on M. 

It follows from a discussion similar to that of §3 that no two unending 
geodesics of class A on M can intersect more than once. 

Now let g be an unending geodesic of class A on M. Clearly g cannot 
become infinite in length in any single copy of T. In leaving a copy of T, g 
cannot be tangent to any of the geodesic segments separating that copy of T 
from the remainder of M. Further, g can have only one point of intersection 
with any such geodesic segment. It follows that an unending class A geodesic 
on M is contained in one and only one unending linear set. 

Henceforth, geodesics and geodesic segments will be assumed to be on M 
unless the contrary is stated. 

By using an argument due to Morse† we obtain the following theorem: 


THEOREM 2. Given any unending linear set, there exists at least one un- 
ending geodesic of class A contained wholly in the given linear set. 


6. Semi-infinite sets and geodesic rays. Let $T_0$ be any copy of T on M. 
A semi-infinite linear set starting from $T_0$ will be defined by a sequence of 
copies of T that go to make up M, namely: $T_0, T_1, T_2, \dots$, in which each 
copy $T_r$ of T is joined to the succeeding one along a common boundary and 
all copies are distinct. 

A geodesic ray of class A issuing from a point P will be defined to be a 
portion of a geodesic, in one sense unending and in the other stopping at P, 
and such that every finite segment is of class A. Corresponding to Theorem 2 
we have the following theorem for a semi-infinite linear set. 


* Loc. cit., p. 216. 
† American Journal of Mathematics, loc. cit., pp. 47-48. 




THEOREM 3. Given a semi-infinite linear set starting from a copy $T_0$, 
there exists issuing from any point P of $T_0$ at least one geodesic ray of class A 
contained wholly in the given linear set. 


If the given surface has two knobs, there are only two semi-infinite 
linear sets beginning with any copy $T_0$ of T. If, however, the number of 
knobs exceeds two, it is readily shown that the number of such sets starting 
from a given $T_0$ has the power of the continuum. 

By Theorem 3 there exists issuing from a point P on a given copy $T_0$ of 
T at least one geodesic ray of class A belonging to each semi-infinite linear 
set beginning with that copy. For $n>2$, therefore, there are through P a set 
of such geodesic rays in power equal to the power of the continuum. 
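
The count of semi-infinite linear sets can be sketched as follows. This is only an illustrative sketch, not the author's argument: it assumes that the copies of T on M are arranged like the vertices of a tree, each copy meeting 2(n-1) neighbours across the cut pieces, so that a linear set is a path that never returns across the piece by which it entered. The symbol $N_k$ is a label introduced here for the number of linear sets $T_0, T_1, \dots, T_k$.

```latex
% Hedged sketch (assumption: tree-like adjacency of the copies of T).
% First step: any of the 2(n-1) cut pieces of T_0.
% Each later step: any cut piece except the one just crossed, i.e. 2n-3 choices.
\[
  N_k \;=\; 2(n-1)\,(2n-3)^{\,k-1}, \qquad k \ge 1 .
\]
% n = 2: N_k = 2 for every k, so there are exactly two semi-infinite sets.
% n >= 3: every step offers 2n-3 >= 3 choices, so the semi-infinite sets
% correspond to infinite sequences over a finite alphabet of at least 3 letters:
\[
  \bigl|\{\text{semi-infinite linear sets from } T_0\}\bigr|
  \;=\; \bigl|\{1,\dots,2n-3\}^{\mathbb{N}}\bigr|
  \;=\; 2^{\aleph_0} .
\]
```

Under this assumption the two cases of the text (two sets for two knobs, a continuum of sets otherwise) fall out of the same formula.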

Let the positive sense of any geodesic ray of class A through P be the 
sense that leads from P. The direction of any such geodesic ray will now be 
specified by the angle $\theta$ ($-\pi < \theta < \pi$) measured in an arbitrary sense about P, 
between the positive tangent to it at P and the tangent at P to an arbitrary 
geodesic through P. We shall show that the infinite set of directions $\theta$ has 
a very remarkable subset which is perfect and nowhere dense. 

7. Special and general linear sets. Any semi-infinite linear set L is 
topographically equivalent to one of the two regions of the plane bounded by 
two parallel straight lines and a transversal to them. Consider those boundaries 
B and B’ of L which in this correspondence are topographically 
equivalent to the semi-infinite segments of the parallel lines. 

Now all the semi-infinite linear sets whose first copy of T contains P 
will be divided into two classes: 


(A) General linear sets; 

(B) Special linear sets. 

A semi-infinite linear set will be called special if either B or B’, after at most 
a finite segment, consists entirely of a boundary of M made up of a succession 
of segments which are congruent to a single one of the boundaries c; on S. 
Any set not special will be called general. 

Consider any two semi-infinite linear sets L and L’, beginning with $T_0$, 
and let them be represented by the sequences of copies of T: $T_0, T_1, T_2, \dots$, 
and $T_0, T_1', T_2', \dots$, respectively. If the successive copies $T_1', T_2', 
T_3', \dots, T_m'$ are respectively the same as $T_1, T_2, \dots, T_m$ but if $T_{m+1}'$ is different 
from $T_{m+1}$, the two linear sets will be said to diverge after sharing m copies of T. 
Now if B and B’ are the semi-infinite boundaries of L, it is clear that $T_{m+1}'$ 
is joined to $T_m' = T_m$ along a segment q of either B or B’ which arose from one 
of the pieces $h_i$. The set L’ will be said to diverge from L along B or B’ 
according as q lies on B or B’. 




The general linear sets (A) and the special linear sets (B) may now be 
characterized with respect to diverging linear sets as follows: In the case of 
any general linear set L, there exist, corresponding to any positive integer m, 
linear sets which diverge from L along B and along B’ after sharing with L more 
than m copies of T. In the case of a special linear set L, for sufficiently large 
values of m, all linear sets different from L and sharing with L more than m 
copies of T, diverge from L along either B or B’, but not both. 

8. Boundary geodesic rays. Consider now a general linear set L. Let 
$q_1, q_2, \dots$ be successive segments of B which arose from cuts $h_i$. Let $L_1$ 
be a linear set which diverges from L along $q_1$, $L_2$ a linear set which diverges 
from L along $q_2$, and so on. On each linear set $L_i$ there exists at least one 
geodesic ray of class A issuing from P. Let $\theta_1, \theta_2, \dots$ be the directions at 
P of geodesic rays of class A issuing from P, chosen on $L_1, L_2, \dots$, respectively. 
The set of $\theta$’s has at least one limit angle $\Theta$. Let g be the geodesic 
ray through P with the direction $\Theta$. Then g belongs to a linear set, which 
must be L, and is itself of class A. 

If $\theta$ is measured in a proper sense about P, it will be true that for all 
integers i exceeding a suitably chosen integer m, $\theta_i < \theta_{i+1}$. Hence $\Theta$ will be 
the limit of an increasing sequence of the $\theta_i$’s. 

It may well happen that there exist in some or all of the linear sets $L_i$ 
more than one geodesic ray of class A issuing from P and belonging to $L_i$. 
It is conceivable that, if a different set of these geodesic rays were picked 
out, and correspondingly different angles $\theta_i^*$, a different angle $\Theta^*$ would be 
approached as a limit. If it is remembered, however, that no two geodesic 
rays issuing from P can intersect, it will be seen that for a proper choice of 
$\theta = 0$, $\theta_{i-1} < \theta_i^* < \theta_{i+1}$. Hence $\Theta = \Theta^*$ and the geodesic ray determined by the 
limiting process is unique. 

Similarly, let $q_1', q_2', q_3', \dots$ be successive segments of B’ which arose 
from cuts $h_i$. Let $L_1'$ be a linear set diverging from L along $q_1'$ and so on. 
If $\theta_1', \theta_2', \dots$ are the directions at P of geodesic rays of class A lying on 
$L_1', L_2', \dots$, the set of directions $\theta_i'$ has a limit $\Theta'$ which defines a geodesic 
ray g’ through P. The geodesic ray g’ is a ray of class A on L and does not 
depend upon the particular selection of geodesic rays, issuing from P, made 
from $L_1', L_2', \dots$. 

The geodesic rays g and g’ bound a region R in L between which there can 
lie no geodesic rays of class A issuing from P except those remaining forever 
in L. Further if g’’ be any geodesic ray of class A issuing from P and lying 
in L, and not g or g’, it must lie in this region R. For otherwise g’’ would cut 
all of the geodesic rays of $L_i$ or else all of the geodesic rays of $L_i'$ which issue 
from P with angles nearer $\Theta$ or $\Theta'$ respectively than the initial angle of g’’. 




The geodesic rays g and g’ will be called the boundary geodesic rays of the 
set L. In case there is only one geodesic ray of class A on L, g=g’. 

For special linear sets, we adopt a similar procedure except that only 
one boundary B or B’ contains an infinite number of segments which arose 
from cuts $h_i$. Hence only one boundary geodesic ray is defined for a special 
linear set. 

We shall consider in the following only the boundary rays of the semi- 
infinite linear sets whose first copy of T contains P. 

9. Generalization of a theorem of Hadamard. We prove the following 
theorem. 


THEOREM 4. The set of the directions at P of all the boundary rays of class 
A issuing from any given point P of T is perfect and nowhere dense. 


It follows from the process by which boundary rays are defined that the 
direction of each such ray is the limit of the directions of others of the same 
kind, and further that any limit direction of an infinite subset of directions 
of boundary rays is itself the direction of a boundary ray. Hence the given 
set is perfect. 

Between the boundary rays of any given linear set (if there are two), 
there are, of course, geodesic rays with intermediate directions. Such rays 
are clearly not themselves boundary rays, since there are at most two 
boundary rays in any given linear set. 

Consider, on the other hand, two boundary rays g and g’ which belong to 
different linear sets L and L’. There exists a last copy of T, say $T_m$, which the 
two sets have in common. On the boundary of $T_m$, there exists between the 
pieces to which $T_{m+1}$ of L and $T_{m+1}'$ of L’ are attached, in either cyclic order, 
at least one point Q on the boundary of M. Therefore a class A geodesic 
segment may be drawn from P to Q whose direction will be between the 
directions of the boundary rays g and g’, and which will pass off M at Q. All 
rays through P with angles sufficiently near that of the geodesic segment 
PQ will also pass off M and hence cannot be boundary rays through P. Hence 
the set of directions of boundary rays through P is nowhere dense and the 
theorem is proved. 

As a special case, this theorem reduces to a result obtained by Hadamard 
in his study of surfaces of negative curvature.* Let us suppose that the closed 
surface of genus zero with which we start is such that the removal of the 
knobs leaves a region S which is of negative curvature throughout. By virtue 
of this special condition, all geodesic rays issuing from a given point P on M 


* Loc. cit., p. 69. 




and remaining on M are of class A. It can be shown that there exists issuing 
from P, one and only one geodesic ray belonging to each semi-infinite linear 
set whose first copy of T contains P. 

The theorem for the special case under consideration may now be expressed 
in a form not involving the covering surface, and becomes then the theorem 
of Hadamard: 

If S is of negative curvature throughout and if P is any point on S, the set 
of the directions at P of the geodesic rays which issue from P and remain on S is 
perfect and nowhere dense. 


Part III. PERIODIC GEODESICS 


10. Definitions. We shall return to the consideration of unending 
linear sets. Let such a set be represented by the succession of copies of T, 

(1) $\cdots, T_{-2}, T_{-1}, T_0, T_1, T_2, \cdots$. 

The symbols of this sequence may be put into one-to-one correspondence with 
a succession of symbols 

(2) $\cdots, p_{-2}, p_{-1}, p_0, p_1, p_2, \cdots$, 

as follows. Let the 2(n—1) boundary pieces of T, which arise from the cuts 
$h_i$ ($i = 1, 2, \dots, n-1$), be numbered in cyclic order, beginning with an 
arbitrary piece. These numbers will then be associated with the boundary 
pieces of each of the copies $T_k$ occurring in (1). In the linear set (1), let $p_k$ be 
the number of the boundary piece of $T_k$ to which $T_{k+1}$ is joined. Then (2) is 
the sequence obtained by replacing $T_k$ by $p_k$. 

Suppose now that (2) is found to consist, in both senses, of an unending 
repetition of a finite set G of successive symbols. The linear set will then be 
called periodic and G a generator of the set. Any generator of (2) which is 
made up of the smallest possible number of symbols will be called a funda- 
mental generator of (2). Any succession of copies of T in (1) which corresponds 
to a generator of (2) will be called a cycle of the linear set. A fundamental 
cycle will then be a cycle which corresponds to a fundamental generator. 
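
The passage from a generator to a fundamental generator can be illustrated computationally. The following is a minimal sketch, not part of the original paper: the function name `fundamental_generator` and the representation of a generator as a Python list of boundary-piece numbers are assumptions made for illustration. It relies on the fact that the least period of a sequence periodic in both senses divides the length of every generator.

```python
def fundamental_generator(generator):
    """Return the shortest block whose repetition reproduces `generator`.

    `generator` is a finite sequence of symbols (here: boundary-piece
    numbers) whose unending repetition gives the symbol sequence (2).
    The result is one fundamental generator; its cyclic rotations are
    the others.
    """
    symbols = list(generator)
    n = len(symbols)
    if n == 0:
        raise ValueError("a generator must contain at least one symbol")
    # Try block lengths in increasing order; only divisors of n can work.
    for length in range(1, n + 1):
        if n % length == 0 and symbols == symbols[:length] * (n // length):
            return symbols[:length]


# Hypothetical generator: the symbols 1, 3 repeated three times.
print(fundamental_generator([1, 3, 1, 3, 1, 3]))
```

The number of symbols in the returned block is then the number of copies of T in a fundamental cycle of the corresponding periodic linear set.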

Those geodesic segments which separate successive cycles of the periodic 
linear set from each other will clearly be congruent. 

In the representation (1) of a given unending linear set L, a copy $T_k$ of 
T will be said to occur n copies later than another copy $T_{k'}$ of T, if $k - k' = n$. 
We shall now define a transformation $t$ of a periodic set L into itself, as 
follows: Suppose that each fundamental cycle of L is composed of n copies of 
T. Then under $t$ every point of each copy $T_k$ of T is to be replaced by its 
congruent point in that copy of T which in (1) occurs n copies later. Congruent 
points under $t$ will be spoken of as congruent points one fundamental 
cycle apart and congruent points under $t^m$ as congruent points m fundamental 
cycles apart. 

11. Lemma on class A geodesic segments joining congruent points. 
We prove the following lemma. 


Lemma 1. A class A geodesic, joining congruent points m fundamental 
cycles apart on a periodic linear set, intersects itself on the original surface in 
such a manner, that if cut at these points of intersection, the resulting geodesic 
segments can be regrouped, reordered, and rejoined on M so as to make up m 
curve segments each joining congruent points one fundamental cycle apart. 


Let R be the region formed by m>1 successive fundamental cycles of a 
periodic linear set. Let C be a class A geodesic segment connecting congruent 
points A and B on the boundaries $b_1$ and $b_2$ of R which separate it 
from the remainder of the linear set. We shall first prove that there exist on C 
at least two congruent points one fundamental cycle apart. Let C’ be the 
curve segment determined by all the points congruent to C under $t$, the 
transformation of the preceding paragraph. If C and C’ have a point in 
common on R, the point in common on C, if considered also as a point on C’, 
appears clearly congruent to a point on C one fundamental cycle previous. 
We must show then that C and C’ do have a point in common on R. Now C 
divides R into two regions $R_1$ and $R_2$, having no other points in common than 
those which belong to C. We denote by $P_0, P_1, \dots, P_m$ the intersections of 
C with the geodesic segments separating different cycles of R from each other 
and from the remainder of the linear set, the geodesic segments being taken 
in the order of progression along the set. C’ does not intersect the first 
such boundary and extends beyond R in the positive direction of the set. 
Its intersections with the geodesic boundaries of successive cycles will be 
denoted by $P_1', P_2', \dots$. 

If C and C’ have a point in common, the proof is complete. In the contrary 
case, we assume that $P_1'$ lies in $R_2$, but not on C. Then C’ crosses the boundary 
of $R_2$ either along C or $b_2$. In the former case, the proof is again complete. 
In the latter case, the point $P_m'$ on $b_2$ lies in $R_2$. Let $s_1, s_2, \dots, s_m$ be the distances 
along the geodesic segments separating cycles of the linear set, 
measured from the points where these segments enter $R_2$ from outside the 
linear set, to $P_1, P_2, \dots, P_m$, respectively. Let $s_1', s_2', \dots, s_m'$ be similarly 
defined relative to $P_1', P_2', \dots, P_m'$. We have then $s_m' < s_m$, by hypothesis. 
Since $P_m'$ is congruent to $P_{m-1}$, we have $s_m' = s_{m-1}$ and hence $s_{m-1} < s_m$. 
Similarly, $s_{m-1}' < s_{m-1}$, whence $s_{m-2} < s_{m-1}$. Proceeding in this manner, we 
obtain the conditions 

$s_0 < s_1 < \cdots < s_{m-1} < s_m$, 

as the only possible case under which the proof might fail. But $P_0$ and $P_m$ 
are congruent and therefore $s_0 = s_m$. Hence this set of conditions cannot be 
satisfied and C and C’ have a point in common. There exist therefore on C 
two congruent points one fundamental cycle apart. 

Let now P and Q be the points of C one fundamental cycle apart whose 
existence has just been proved. Let AP and QB be the segments of C which 
respectively precede and follow PQ in the order following the linear set. 
Now consider on R the curve $C_1$ which consists of the segment QB and a 
segment congruent to AP under the transformation $t$. Then $C_1$ joins points 
on R which are (m—1) fundamental cycles apart and intersects each geodesic 
segment separating successive cycles once and only once. The procedure 
of the preceding paragraphs will now show that, if m>2, $C_1$ contains 
a subsegment joining congruent points one cycle apart. Repetition of the 
process m times gives the conclusion. 

12. Existence of periodic geodesics of class A. An unending periodic 
geodesic g is defined as an unending geodesic composed of successive con- 
gruent segments. Such a geodesic necessarily lies on an unending periodic 
linear set L, and overhangs a closed geodesic on the surface S. A segment of 
g whose length is that of this closed geodesic on S will be called a fundamental 
segment of g. 


Lemma 2. The length of a class A geodesic segment on M is a continuous 
function of its end points. 


Suppose PQ and P’Q’ are two class A geodesic segments such that the 
distance PP’ = $e_1$ and the distance QQ’ = $e_2$, where $e_1$ and $e_2$ are arbitrarily 
small positive numbers and the distances PP’ and QQ’ are measured along 
class A geodesic segments. If d and d’ are respectively the lengths of PQ 
and P’Q’, it follows from the class A character of PQ that 

$d \le d' + (e_1 + e_2)$, 

i.e., 

$d' \ge d - (e_1 + e_2)$. 

Likewise from the class A character of P’Q’ it follows that 

$d' \le d + (e_1 + e_2)$. 

Combining, 

$d - (e_1 + e_2) \le d' \le d + (e_1 + e_2)$. 


The inequalities express the stated continuity property. 




THEOREM 5. Corresponding to any unending periodic linear set there exists 
on the set at least one unending periodic geodesic of class A. 


Let D be a fundamental cycle of the given linear set. Each pair of congruent 
points on the boundaries $b_1$ and $b_2$ separating D from the remainder 
of the linear set can be joined by a class A geodesic segment lying entirely 
in D. It follows from Lemma 2 that the length of such class A segments is 
a single-valued continuous function of an end point on either $b_1$ or $b_2$. Since 
the domain of such an end point is closed, there exists a segment AB whose 
length $l_m$ gives a minimum among the lengths of all class A geodesics joining 
congruent points on $b_1$ and $b_2$. The segment AB will be proved to be a fundamental 
segment of a periodic geodesic of class A. 

Choose arbitrarily a positive direction along $b_1$ and a corresponding 
congruent positive direction along $b_2$. Then AB makes the same angle with 
$b_1$ as it does with $b_2$. Otherwise, on the surface obtained by “healing” $b_1$ 
and $b_2$ together, AB would have a corner and could be shortened on this 
surface. Recutting the surface to form D, the shortened curve would connect 
congruent points on $b_1$ and $b_2$. The class A geodesic segment joining these 
same two congruent points would be shorter than AB, contrary to the supposition 
that AB was the shortest class A segment connecting congruent 
points on $b_1$ and $b_2$. 

It now follows that in the given unending periodic set of which D is a 
fundamental cycle, the curves congruent to AB in the successive cycles join 
on to each other continuously to form an unending periodic geodesic g of 
which AB is a fundamental segment. 

We shall prove by the aid of Lemma 1 that g is of class A. 

Assume, on the contrary, that there exist on g two points P and Q which 
can be connected on M by a curve shorter than the segment PQ of g. Then 
there can be found on g two points A and B, m fundamental cycles apart, 
containing P and Q between them and lying on geodesic segments separating 
cycles of the linear set from each other, where m is a properly chosen positive 
integer. 

On g, the segment AB has the length $m l_m$. Now join A and B by a geodesic 
segment C of class A and length C. By hypothesis, $C < m l_m$. Hence at least 
one of the m segments of C of Lemma 1 is necessarily less than $l_m$ in length. 
It would follow that a pair of congruent points could be found on the 
geodesic boundaries of a fundamental cycle, for which the shortest connecting 
path on M would be in length less than $l_m$, contrary to the definition 
of $l_m$. Therefore g is of class A and the theorem is completely proved. 
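
The step from the total length to a single short segment is an averaging argument, which can be written out as follows. This is a sketch only: $l_m$ denotes the minimum length fixed in the proof, and the labels $\lambda_1, \dots, \lambda_m$ for the lengths of the m regrouped curve segments of Lemma 1 are introduced here for illustration and do not appear in the original.

```latex
% Hedged sketch: lambda_j are hypothetical labels for the lengths of the
% m curve segments obtained from C by Lemma 1; their total is the length of C.
\[
  \sum_{j=1}^{m} \lambda_j \;=\; C \;<\; m\,l_m
  \quad\Longrightarrow\quad
  \min_{1 \le j \le m} \lambda_j
  \;\le\; \frac{1}{m} \sum_{j=1}^{m} \lambda_j
  \;<\; l_m .
\]
% The class A segment joining the end points of that shortest curve segment
% is no longer than the curve segment itself, hence also shorter than l_m.
```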

It follows from the theorem just proved that there exist on an extremal-convex 
surface S with at least three boundaries, an enumerably infinite 
number of closed (or periodic) geodesics. In particular, there exist closed 
geodesics deformable into each of the boundaries of S. 

13. Class A geodesics of the same type. In the remainder of the paper 
we will state without proof the consequences of the supposition that there 
exist two or more class A periodic geodesics of the same type, that is, be- 
longing to the same linear set. 

It is necessary that all such periodic geodesics of the same type have 
fundamental segments of the same length. 

No two class A periodic geodesics, $g_1$ and $g_2$, of the same type can intersect, 
for if they intersected once, they would intersect an infinite number of times, 
contrary to a fundamental property of class A geodesics. Hence $g_1$ and $g_2$ 
separate a ribbon-like region R from the remainder of the set. We suppose 
that on R there exist no other periodic geodesics of class A. 


THEOREM 6. If a class A geodesic g lies completely on a region R bounded 
by two class A periodic geodesics $g_1$ and $g_2$ and there exist on R no class A 
periodic geodesics other than $g_1$ and $g_2$, then g is either asymptotic to $g_1$ in its 
positive sense and to $g_2$ in its negative sense or it is asymptotic to $g_2$ in its positive 
sense and to $g_1$ in its negative sense, where the positive sense of g is taken to be that 
sense which follows the linear set to which g belongs. 


The results for geodesic rays are similar. 


THEOREM 7. There exist issuing from any point P between $g_1$ and $g_2$ at 
least four geodesic rays of class A, respectively 
(a) positively asymptotic to $g_1$, 
(b) negatively asymptotic to $g_1$, 
(c) positively asymptotic to $g_2$, 
(d) negatively asymptotic to $g_2$. 
Moreover, any geodesic ray of class A issuing from P belongs to one of these 
four classes. 


CORNELL UNIVERSITY, 
ITHACA, N. Y. 


CONCERNING END POINTS OF CONTINUOUS CURVES 
AND OTHER CONTINUA* 


BY 
HARRY MERRILL GEHMAN† 


1. INTRODUCTION 


Several authors have given definitions of an end point of a continuum, 
making use of properties which are possessed by an end point of a straight 
line interval, but which are not possessed by any interior point of the interval. 
We shall show in the present paper that the properties used in these so-called 
“definitions” are not logically equivalent, and shall determine the logical 
relations which do exist among these properties under various conditions. In 
Part 2, we shall show the equivalence of a number of these properties in the 
case of a continuous curve. In Part 3 is shown the equivalence with the 
first set, of two additional properties in the case of a continuous curve of a 
special type. In Parts 4 and 5, we determine the logical relations existing 
among certain of these properties in the case of a bounded continuum. Fin- 
ally, in Part 6, some theorems are proved concerning points having one or 
more of the given properties. 

In this paper we shall consider only plane point sets, although in a 
number of cases it is obvious that our results are true in space of any number 
of dimensions. 

In regard to the use of the word “end point,” we intend hereafter to use 
this word only in the sense of a point of a continuous curve satisfying Wilder’s 
definition or one of the other definitions equivalent to it, in other words, a 
point having any one of the properties 1—7 given in Part 2. We shall not use 
the word “end point” in referring to a point of a continuum which is not 
a continuous curve. The examples of Part 4 show that a point of a con- 
tinuum may have certain of the given properties and yet be so placed with 
respect to that continuum as hardly to deserve the name of “end point.” 


2. CONCERNING PROPERTIES 1-7 FOR A CONTINUOUS CURVE 


The object of this section is to prove the equivalence of the following 


* Presented to the Society, April 2 and September 9, 1926; received by the editors August 10, 
1926. 
† National Research Fellow in Mathematics. 




64 H. M. GEHMAN [January 


properties of a point of a continuous curve.* In each case, P denotes a point 
of a continuous curve M. 

Property 1. If PP’ is any arc in M whose end points are P and any 
other point P’ of M, then the set M—(PP’—P) contains no connected 
subset consisting of more than one point which contains P. 

Property 2. If PP’ is any arc in M whose end points are P and any 
other point P’ of M, then P is not a limit point of any connected subset 
of M—PP’. 

Property 3. If N is any subcontinuum of M containing P, then the 
set M—(N—P) contains no connected subset consisting of more than one 
point which contains P. 

Property 4. P is not a cut point† of any subcontinuum of M. 

Property 5. P is not contained in any subcontinuum of M which is 
irreducible‡ between two other points of M. 

Property 6. If N is any subcontinuum of M containing P, then P is 
not a limit point of any connected subset of M—N. 

Property 7. Given any positive number e, there exists a domain containing 
P of diameter less than e, whose boundary has just one point in 
common with M. 

Property 1 is due to R. L. Wilder.§ Properties 2, 3, and 6 are modifications 
of property 1. Property 4 was suggested by Professor R. L. Moore. Property 
5 is due to Yoneyama.|| Property 7 is due to Menger.¶ 

G. T. Whyburn** has proved that in the case of a continuous curve, 
property 1 is equivalent to the following property: No arc in M contains 
P as an interior point. In the case of a continuum which is not a continuous 


* For a number of equivalent definitions of a continuous curve, see R. L. Moore, Report on 
continuous curves from the viewpoint of analysis situs, Bulletin of the American Mathematical Society, 
vol. 29 (1923), pp. 289-302. In the present paper we shall make one change in the definitions given 
in Report, i.e., we shall define a continuum as a closed and connected point set containing more than 
one point. 

† If M is a connected point set, and P is a point of M, then if M—P is not connected, P is said 
to be a cut point of M; if M—P is connected P is said to be a non-cut point of M. 

‡ A point set K is said to be an irreducible continuum between two points A and B, if K is a continuum 
and contains A and B, but contains no proper subset which is a continuum and contains 
A and B. 

§ R. L. Wilder, Concerning continuous curves, Fundamenta Mathematicae, vol. 7 (1925), pp. 340- 
377. See especially p. 358. 

|| K. Yoneyama, Theory of continuous set (sic) of points, Tôhoku Mathematical Journal, vol. 13 
(1918), p. 130. 

¶ K. Menger, Grundzüge einer Theorie der Kurven, Mathematische Annalen, vol. 95 (1925), 
pp. 277-306. 

**G. T. Whyburn, Concerning continua in the plane, these Transactions, vol. 29 (1927), pp. 369- 
400. See Theorem 12, p. 385. 


1928] END POINTS OF CONTINUA 65 


curve, Whyburn uses property 6 as a definition of an end point of the 
continuum. 

W. L. Ayres* has proved that in the case of a continuous curve, property 
1 is equivalent to the following property: P is a non-cut point of M which 
belongs to no simple closed curve in M. 


THEOREM 1. If a point P of a continuous curve M has any one of the 
properties 1-7, it has all the others. 


In Part 4 of this paper, we shall show that if M is any bounded continuum, 
and P has property 7, it has property 6; if P has property 6, it has 
property 5; if P has property 5, it has property 4. 

If P has property 4, it has property 2. For if it fails to have property 2, 
then M contains an arc PP’, such that P is a limit point of a connected sub- 
set X of M—PP’. Let K denote the maximal connected subset of M—PP’ 
containing X. The point P can be joined to any point P” of K by an arc† 
lying in K except for the point P, and therefore having only P in common 
with the arc PP’. The sum of the arcs PP’ and PP” is an arc P’P” of which 
P is an interior point, and therefore a cut point. But this is contrary to the 
assumption that P has property 4. 

If P has property 2, it has property 1. For if it fails to have property 1, 
then there is some arc PP’ in M, such that M—(PP’—P) contains a con- 
nected subset X containing P and such that X—P is not vacuous. Let K 
be the maximal connected subset of M—(PP’—P) containing P. Since 
K contains X, the set K—P is not vacuous. If Q is any point of K—P, then 
K contains the maximal connected subset of M—PP’ containing Q. Let us 
denote this set by D. Since the point P has property 2, P is not a limit point 
of D, and therefore K contains other points besides P and points of D. 
The set D is closed, save for limit points on PP’, and since the point P is 
not a limit point of D, no point of K—D is a limit point of D. Also, no 
point of D can be a limit point of any set of points of M not in D‡, and 
therefore no point of D is a limit point of K—D. Therefore K is disconnected, 
which is contrary to our supposition concerning K. 

If in the argument in the preceding paragraph, we replace the arc PP’ 
by a subcontinuum WN of M containing P, we can prove that if P has property 
6, it has property 3. Obviously if P has property 3, it has property 6, and 


* W. L. Ayres, Concerning continuous curves and correspondences, Annals of Mathematics, (2), 
vol. 28 (1927), pp. 396-418. See Theorem 3, p. 399. 
† R. L. Wilder, loc. cit., Theorem 1, p. 342. 
‡ R. L. Wilder, loc. cit. See the proof of Theorem 9, p. 360. 




therefore properties 3 and 6 are equivalent. To complete the proof of 
Theorem 1, we need only show that if P has property 1, it has property 7. 

If P has property 1, and PP’ is any arc in M, then P is a limit point of a 
set of points L of PP’—(P+P’), such that if X is any point of L, then 
PX−X and P’X−X lie in different maximal connected subsets of M−X.
For suppose this is not true, and the arc PP’ contains a subarc PQ, such that
for every interior point X of PQ, the sets PX−X and P’X−X lie in the same
connected subset of M−X. Then, for each point X, there is an arc in M−X
joining P to Q, and this arc contains as a subset an arc AB, which has only 
its end points in common with PQ, and which is such that A is either Q or 
an interior point of the arc QX, and such that B is an interior point of the 
arc XP. Also, the point P is not a limit point of the collection of maximal 
connected subsets of M—PQ that have limit points on XQ, because P 
is not a limit point of any one of them (since it has property 1), and only a 
finite number of these sets can be of diameter greater than half the distance 
from P to the nearest point of XQ.* Therefore there is a last point (necessar- 
ily different from P) on the arc XP to which an arc AB, of the type described 
above, can be constructed. We have therefore shown that corresponding 
to any point X, an arc AB as described above can be constructed having the 
additional property that no interior point of the arc BP can be joined to a 
point of XQ−X by an arc having only its end points in common with PQ.

Let us then select a point $B_0$, which is an interior point of the arc PQ.
Let $A_1B_1$ be an arc corresponding to $B_0$. The point $A_1$ is a point of $B_0Q - B_0$,
and $B_1$ is an interior point of $PB_0$. Let $A_2B_2$ be an arc corresponding to $B_1$.
The point $A_2$ is a point of $B_1B_0 - B_1$, and $B_2$ is an interior point of $PB_1$.
Continuing this process, for n ≥ 2, there exists an arc $A_{n+1}B_{n+1}$ corresponding
to $B_n$, where $A_{n+1}$ is a point of $B_nB_{n-1} - B_n$, and $B_{n+1}$ is an interior point of
$PB_n$.

Any set of points $B_0, B_1, B_2, \ldots$ on an arc PQ, and such that $B_i$ follows
$B_{i+1}$, must have a sequential limit point on PQ. We shall show that under
the given conditions, this sequential limit point must be the point P. For
suppose a sequence of this type has a point C different from P as a sequential
limit point. Then there exists an arc A’B’ corresponding to C, where A’ is a
point of CQ−C, and B’ is an interior point of PC. Some point of the sequence
$B_0, B_1, B_2, \ldots$, say $B_n$, is an interior point of the arc CA’. By our method of
constructing the arc $A_{n+1}B_{n+1}$ corresponding to $B_n$, no interior point of
$B_{n+1}P$ can be joined to a point of $B_nQ - B_n$ by an arc having only its end
points in common with PQ. But $B_{n+1}$ lies between C and A’, and therefore


* W. L. Ayres, loc. cit., Theorem 1, p. 396. 




B’ is an interior point of $B_{n+1}P$, while A’ is a point of $B_nQ - B_n$. The existence
of the arc A’B’ shows that our given method of construction was not followed
in constructing the arc $A_{n+1}B_{n+1}$. Having arrived at this contradiction by
supposing C to be the limit of the sequence, it follows that P is the sequential
limit point of any sequence $B_0, B_1, B_2, \ldots$, obtained as described above.

The continuum consisting of the arc PQ and the sequence of arcs $A_1B_1$,
$A_2B_2$, $A_3B_3$, ... is a continuous curve, every subcontinuum of which is a
continuous curve.* Let $M_1$ denote the continuum consisting of P, the arcs
$A_{2n+1}B_{2n+1}$ (n=0, 1, 2, ...), and the arcs $B_{2n+1}A_{2n+3}$ (n=0, 1, 2, ...)
of the arc PQ. Let $M_2$ denote the continuum consisting of P, the arcs
$A_{2n}B_{2n}$ (n=1, 2, 3, ...), and the arcs $B_{2n}A_{2n+2}$ (n=1, 2, 3, ...) of the arc
PQ. These two continua have only the point P in common, and since each
is a continuous curve, we can construct in each an arc having P as an end
point. But in that case P fails to have property 1, which is contrary to hy-
pothesis. We have thus established that if P has property 1, any arc PP’
contains a set of cut points of M having P as a limit point, each of these
points X being such that PX−X and P’X−X lie in different maximal con-
nected subsets of M−X.
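The index bookkeeping behind the two continua can be checked on a toy model. The sketch below is only an illustration under stated assumptions: the arc PQ is parametrized by the interval [0, 1] with P at 0 and Q at 1, and the positions chosen for the points B and A (the functions `b` and `a`) are hypothetical, picked only so that each A with index n+1 lies strictly beyond the B with index n and not beyond the B with index n−1, as the construction requires. It checks only that the subarcs of PQ used by the odd-indexed continuum and those used by the even-indexed continuum are disjoint; it says nothing about the arcs lying off PQ.

```python
from fractions import Fraction


def b(n):
    """Hypothetical position of B_n on PQ (P at 0, Q at 1); decreases to 0."""
    return Fraction(1, 2 ** (n + 1))


def a(n):
    """Hypothetical position of A_n (n >= 1), strictly above B_{n-1}.

    For n >= 2 it is the midpoint of B_{n-1} and B_{n-2}; for n = 1 it is
    the midpoint of B_0 and Q.
    """
    upper = Fraction(1) if n == 1 else b(n - 2)
    return (b(n - 1) + upper) / 2


def pieces_on_pq(start, count):
    """Subarcs B_k A_{k+2} of PQ for k = start, start + 2, ... as (low, high)."""
    return [(a(k + 2), b(k)) for k in range(start, start + 2 * count, 2)]


def overlap(p, q):
    """True when two closed intervals share at least one point."""
    return max(p[0], q[0]) <= min(p[1], q[1])


odd_pieces = pieces_on_pq(1, 10)   # B_1A_3, B_3A_5, ...  (part of M_1)
even_pieces = pieces_on_pq(2, 10)  # B_2A_4, B_4A_6, ...  (part of M_2)
disjoint = not any(overlap(p, q) for p in odd_pieces for q in even_pieces)
print("odd and even subarcs disjoint:", disjoint)
```

In this model each subarc sits inside its own half-open stretch between two consecutive points B, which is why the odd and even families cannot meet.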

Let PP’ be an arc in M, and let $P_1, P_2, P_3, \ldots$ be a sequence of points
of PP’ which are cut points of M of the type described above, and whose
sequential limit point is P. Given any positive number ε’, we can select a
positive number ε, such that ε is less than ε’, and is less than the distance
from P to P’. The number of maximal connected subsets of M−PP’ of
diameter greater than ε/6 is finite, and since P is not a limit point of any
one of these sets, we can select an integer n such that the diameter of $PP_n$
is less than ε/6, and such that no maximal connected subset of M−PP’
of diameter greater than ε/6 has any limit points on $PP_n$. If we denote by
N the maximal connected subset of $M - P_n$ that contains P, it follows that
the diameter of N is less than ε/2. The continuum $N + P_n$ cannot contain a
simple closed curve enclosing the set $P_nP' - P_n$, otherwise the diameter of N
would be greater than the distance from P to P’, which is impossible.

If no simple closed curve in M−N encloses a point of N, then if we add
to $N + P_n$ all points of the plane which are interior to a simple closed curve
in $N + P_n$, we obtain a continuum K which does not separate the plane. Let
$H_1$ denote the continuum consisting of all points of $M + K - (K - P_n)$, and
let H denote the continuum consisting of $H_1$ and all points of the plane which
are interior to a simple closed curve in $H_1$. The two continua K and H


* H. M. Gehman, Some conditions under which a continuum is a continuous curve, Annals of 
Mathematics, (2), vol. 27 (1926), pp. 381-384. See especially Theorem 2, p. 382. 




satisfy certain conditions* under which there exists a simple closed curve
enclosing $K - P_n$ but not enclosing any other points of K+H, and containing
$P_n$, but not containing any other points of K+H. Therefore there exists a
simple closed curve enclosing N, not enclosing $P_nP' - P_n$, and having only
$P_n$ in common with M.

In case a simple closed curve in M−N encloses a point of N, it encloses
all of N. Then, by Theorem 3 of S.P.S., there exists a simple closed curve
having the properties mentioned at the end of the preceding paragraph.

Let us denote by J the simple closed curve enclosing N. In case J is of
diameter less than ε, its interior is the domain required in order that P have
property 7. In case J is of diameter greater than ε, we shall show how to
replace J by a simple closed curve whose diameter is less than ε, which en-
closes N, and which has only $P_n$ in common with M. Let us denote by I the
interior of J.

Let us denote by K the continuum consisting of all points of M in J+I,
and all points of the plane which are interior to a simple closed curve of
M in J+I. Then by Theorem 1 of S.P.S., there exists a simple closed curve
L which encloses K and is such that every point of L plus its interior is at
a distance less than ε/4 from some point of K. By the way in which the
point $P_n$ was selected, the diameter of K is less than ε/2, and therefore the
diameter of L is less than ε. Since the diameter of J is greater than ε, J
is not entirely contained in L plus its interior, and therefore L contains some
points which are interior to J. The two simple closed curves J and L satisfy
the conditions† under which there exists a simple closed curve J’ which is a
subset of J+L, which contains an arc through $P_n$, and every point of whose
interior is interior to both J and L. The simple closed curve J’ has only $P_n$
in common with M, because J’ is a subset of J plus that portion of L which
is interior to J, and this portion of L has no points in common with M.
Furthermore, since every point of the interior of J’ is a point of the interior
of L, the diameter of J’ cannot be greater than the diameter of L, that is,
the diameter of J’ is less than ε. The existence of the simple closed curve J’
shows that P has property 7, as the interior of J’ is the domain required in
order that P have property 7. This completes the proof of Theorem 1.

DEFINITION. A point of a continuous curve which has properties 1-7 
is said to be an end point of the continuous curve. 


*R. L. Moore, Concerning the separation of point sets by curves, Proceedings of the National 
Academy of Sciences, vol. 11 (1925), pp. 469-476. See Theorem 2, p. 470. We shall refer to this 
paper hereafter as S.P.S. 

† R. L. Moore, On the Lie-Riemann-Helmholtz-Hilbert problem of the foundations of geometry,
American Journal of Mathematics, vol. 41 (1919), pp. 299-319. See especially Theorem 26, p. 311.




Note that in the course of the proof of Theorem 1, we have also proved 
the following theorem: 


THEOREM 2. An end point P of a continuous curve M has this property: 
given any positive number ε, there exists a simple closed curve enclosing P,
of diameter less than ε, and having just one point in common with M. Conversely,
if a point P of a continuous curve M has the above property, then P is an end 
point of M. 


3. CONCERNING PROPERTIES 8-9 FOR A CONTINUOUS CURVE 


We shall now introduce two additional properties of a point P of a con- 
tinuous curve M. 

Property 8. If P’ and P” are any two points of M different from P, 
then any two subcontinua of M irreducible between P and P’, and between 
P and P”, respectively, have in common a continuum containing P. 

Property 9. There exists a positive number x, such that if P’ and P”
are any two points of M at a distance less than x from P, then one of any 
two subcontinua of M irreducible between P and P’, and between P and P”, 
respectively, contains the other. 

Properties 8 and 9 are due to Yoneyama.* A property analogous to 
property 9 has also been given by Young.†

In Part 4 of this paper, it is shown that if M is any bounded continuum, 
and P has property 9, it has property 8; and if P has property 8, it has pro- 
perty 4. Therefore, in the case of a continuous curve, if P has property 8, 
it has properties 1-7. The following examples will show (1) that a point P 
of a continuous curve M may have properties 1-7, without having properties 
8 or 9; (2) that a point P of a continuous curve M may have properties 1-8, 
without having property 9. 

EXAMPLE 1. Let M consist of the point (0, 0), and the circles with
center at $(3/2^n, 0)$ and with radius equal to $1/2^n$, for n=1, 2, 3, ..., and
let P be the point (0, 0). The point P has properties 1-7, but not properties
8 and 9.

EXAMPLE 2. Let M consist of the straight line intervals between (0, 0)
and (1, 0), and between $(1/2^n, 0)$ and $(1/2^n, 1/2^n)$, for n=1, 2, 3, ...,
and let P be the point (0, 0). The point P has properties 1-8, but not proper-
ty 9.
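The geometry of Examples 1 and 2 can be checked with exact arithmetic. The following is a minimal sketch, not part of the original argument; the helper names `circle_extent` and `tooth` are hypothetical. It computes where the n-th circle of Example 1 meets the x-axis and the end points of the n-th vertical interval of Example 2.

```python
from fractions import Fraction


def circle_extent(n):
    """Leftmost and rightmost x-coordinates of the n-th circle of Example 1."""
    center = Fraction(3, 2 ** n)
    radius = Fraction(1, 2 ** n)
    return center - radius, center + radius


def tooth(n):
    """End points of the n-th vertical interval of Example 2."""
    x = Fraction(1, 2 ** n)
    return (x, Fraction(0)), (x, x)


for n in range(1, 6):
    left, right = circle_extent(n)
    print(n, "circle meets the x-axis at", left, "and", right,
          "| tooth top at", tooth(n)[1])
```

With these formulas the rightmost point of circle n+1 coincides with the leftmost point of circle n, so in Example 1 consecutive circles touch and the chain of circles closes down on P = (0, 0); in Example 2 the vertical intervals shrink to P along the base interval.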


* K. Yoneyama, On continuous set (sic) of points, II, Tôhoku Mathematical Journal, vol. 18
(1920), p. 254.
† W. H. Young and G. C. Young, The Theory of Sets of Points, 1906, p. 220.




THEOREM 3. A necessary and sufficient condition that a point P of a 
continuous curve M have property 8, is that P have properties 1-7, and that any 
arc PQ of M contains a sub-arc PX, every interior point of which is a cut point 
of M.


Let P be an end point of a continuous curve M such that if PQ is any 
arc of M, then PQ contains a sub-arc PX every interior point of which is a 
cut point of M. This is equivalent to the statement that if PQ is any arc of 
M, the point P is not a limit point of those maximal connected subsets 
of M−PQ that have more than one limit point on PQ. We shall now show
that P also has property 8. 

Let us select a definite point Q and a definite arc PQ. Let ε be a positive
number which is less than the distance from P to Q, and less than the distance
from P to any point of a maximal connected subset of M−PQ that has more
than one limit point on PQ. By Theorem 2, we can construct a simple closed
curve J enclosing P, of diameter less than ε, and having just one point X
in common with M. The point X is necessarily an interior point of the arc 
PQ. 

If P’ is any point exterior to J, any subcontinuum of M which is ir-
reducible between P and P’ must contain X, and therefore can be expressed
as the sum of two continua irreducible between P’ and X and between X
and P, respectively, and having only X in common. We shall now show that
any subcontinuum of M irreducible between P and any point Y lying in J
plus its interior, has an arc in common with the arc PX of the arc PQ.


Note that under our given conditions, every interior point of the arc 
PX is a cut point of M. Therefore if Y is any point of PX —P, any con- 
nected subset of M containing Y and P necessarily contains all points of the 
arc PY, and therefore any subcontinuum of M irreducible between P and Y 
must coincide with this arc. If Y is a point of a maximal connected subset
of M−PQ in the interior of J, any connected subset of M containing Y
and P must contain the point Z which is the limit point on PQ of the maximal
connected subset of M−PQ that contains Y. Since P has properties 1-7,
Z is different from P. As before, any connected subset of M that contains Z 
and P must contain the arc PZ, and therefore any subcontinuum of M 
irreducible between P and Y has the arc PZ in common with PQ. 

Therefore if P’ and P” are any two points of M, each one of any two 
subcontinua of M irreducible between P and P’, and between P and P” 
respectively, has an arc containing P in common with PQ, and the common 
part of these two arcs is the continuum required in order that P have 
property 8. Therefore the condition is sufficient. 




Suppose now that a continuous curve M contains a point P which has
properties 1-8, and suppose that for some arc PQ, the point P is a limit point
of non-cut points of M on PQ, and is therefore also a limit point of maximal
connected subsets of M−PQ which have more than one limit point on PQ.
Let us select a sequence $D_1, D_2, D_3, \ldots$ of maximal connected subsets of
M−PQ having P as a limit point, such that $D_i$ has at least two limit points
on PQ, and such that every limit point of $D_{i+1}$ on PQ lies between P and each
limit point of $D_i$ on PQ. The sequence can be selected so as to satisfy this
latter condition, because P is not a limit point of any one of the sets $D_i$
(because P has properties 1-8), and because the number of maximal con-
nected subsets of M−PQ of diameter greater than any given positive
number is finite.

If $A_i$ and $B_i$ are two points of PQ which are limit points of $D_i$, an arc
can be constructed from $A_i$ to $B_i$ in the set $D_i + A_i + B_i$. The set of arcs
$A_iB_i$ (i=1, 2, 3, ...) plus the set of arcs $B_iA_{i+1}$ (i=1, 2, 3, ...) of the arc
PQ, plus the point P, is a continuous curve, by the argument given in the
proof of Theorem 1, and therefore this set contains an arc from $A_1$ to P.
However this arc $A_1P$ and the arc PQ do not have in common a continuum
containing P, and therefore P fails to have property 8, which is contrary to
hypothesis. Therefore the condition is necessary.


COROLLARY 3a. If a point P of an acyclic* continuous curve has any
one of the properties 1-8, it has all the others. 


THEOREM 4. A necessary and sufficient condition that a point P of a con- 
tinuous curve M have property 9, is that if PQ is any arc of M, then P is not 
a limit point of M—PQ. 


Let P be a point having the given property, and let us select a definite 
arc PQ of M. Then since P is not a limit point of M−PQ, there exists a point
Y of PQ, such that no point of the arc PY is a limit point of M−PQ. Let x
be a positive number which is less than the distance from P to any point of
M−PY. If P’ is any point of M at a distance less than x from P, the point
P’ is a point of the arc PY. Any connected subset of M containing both P 
and P’ must contain the arc PP’, and therefore the only subcontinuum 
of M which is irreducible between P and P’ is the arc PP’. Therefore if P’ 
and P” are any two points of M at a distance less than x from P, one of the 
two arcs PP’, PP” will contain the other, and therefore P has property 9, 
and the condition is sufficient. 


* An acyclic continuous curve is a continuous curve containing no simple closed curve. 




The condition is necessary, for suppose P has property 9 (and therefore 
properties 1-8), and yet there is an arc PQ of M such that P is a limit point 
of M−PQ. Then if we select any positive number x, there are points of
two different maximal connected subsets of M−PQ at a distance less than
x from P, for if there were only one, P would fail to have properties 1-9.
Let these sets be $D_1$ and $D_2$. If $P_i$ is a point of $D_i$, an arc $P_iP$ can be con-
structed in $D_i + PQ$ and evidently neither of the arcs $P_1P$, $P_2P$ contains the
other, contrary to our hypothesis that P has property 9.


COROLLARY 4a. If a point P of an arc has any one of the properties 1-9,
it has all the others. 


To recapitulate: 


THEOREM 5. For a point of a continuous curve, properties 1-7, Whyburn’s 
property, Ayres’ property, and the property mentioned in Theorem 2 are 
equivalent; property 8 is stronger than any of these; and property 9 is stronger 
than property 8. For a point of an acyclic continuous curve, properties 1-8 
are equivalent, and property 9 is stronger than any of them. For a point of an 
arc, properties 1-9 are equivalent. 


4. CONCERNING PROPERTIES 4-9 FOR A BOUNDED CONTINUUM 


In this part, we shall consider the logical relations between properties 
4-9, for the case where M is a bounded continuum.


THEOREM 6. If a point P of a bounded continuum M has property 9, it has 
property 8. 


Suppose P has property 9, but not property 8. Then there exist two 
points P’ and P” of M, and two subcontinua N’, N” of M irreducible between
P and P’, and between P and P”, respectively, but which do not have in 
common any continuum containing P. Since P has property 9, we can select 
a point Q’ of N’, and a point Q” of N” sufficiently close to P that one of any 
two subcontinua of M irreducible between P and Q’, and between P and Q”, 
contains the other. Since Q’ and P are points of N’, the set N’ contains a 
subcontinuum K’ which is irreducible between Q’ and P, and similarly 
N” contains a subcontinuum K” which is irreducible between Q” and P. We 
have shown above that one of the two continua K’, K” contains the other. 
Suppose K’ contains K”. If so, the two continua N’ and N” have in common
the continuum K” which contains P, which is contrary to our supposition
that N’ and N” do not have in common a continuum containing P.




THEOREM 7. If a point P of a bounded continuum M has property 8, 
it has property 4. 


Suppose P has property 8, but not property 4. Then, M contains a sub-
continuum N containing P, such that N−P is not connected. Let $H_1 + H_2$
be any method of expressing N−P as the sum of two sets having no points
in common and neither containing a limit point of the other. Then $H_1 + P$
and $H_2 + P$ are two subcontinua of M having only P in common. The set
$H_1 + P$ contains a subcontinuum irreducible between P and any point P’
of $H_1 + P$, and $H_2 + P$ contains a subcontinuum irreducible between P and
any point P” of $H_2 + P$. These two continua have only P in common, and
therefore P does not have property 8, contrary to hypothesis.


THEOREM 8. If a point P of a bounded continuum M has property 7, it 
has property 6. 


Suppose P has property 7, but not property 6. Then there is some sub-
continuum N of M containing P, such that P is a limit point of some con-
nected set L which is a subset of M−N. Let X be a point of N−P, and Y a
point of L. Let ε be less than the distance from P to X, and less than the
distance from P to Y. Since P has property 7, there exists a domain D which
contains P, whose exterior contains both X and Y, and whose boundary has
only one point Q in common with M. The connected set N contains the point
P in D, and the point X exterior to D, and therefore contains a point of the
boundary of D. Therefore Q is a point of N. But the connected set L+P
also contains the point P in D, and the point Y exterior to D, and therefore
Q is also a point of M−N. But it is impossible for Q to be both a point of N
and a point of M−N, and therefore if P has property 7, it also has property 6.


THEOREM 9. If a point P of a bounded continuum M has property 6, it 
has property 5. 


Suppose P has property 6, but not property 5. Then M contains a sub-
continuum N containing P, which is irreducible between two points P’,
P” different from P. Let K be any subcontinuum of N containing P, but not
P’ or P”, and let D’ be the maximal connected subset of N−K that contains
P’. Since P has property 6, P is not a limit point of D’, and therefore if D’
also contained P”, the continuum consisting of D’ and its limit points in K
would be a proper subcontinuum of N containing P’ and P”, which is im-
possible. Therefore P” lies in a maximal connected subset of N−K different
from D’, and we shall denote the one in which P” lies by D”. The set
K+D’+D” is a continuum which is a subset of N and contains P’ and P”,




and must therefore be identical with N. Therefore N−K consists of two
and only two maximal connected subsets, one of which contains P’, and the
other P”. Let us denote by E’ and E” the sets of points of K which are limit
points of D’ and D” respectively. Since P is not a point of either E’ or E”,
the sets E’, E” have no points in common. For if they had points in common,
D’+E’+D”+E” would be a proper subcontinuum of N containing both
P’ and P”, which is impossible. Also D’+E’ is irreducible between P’ and
any point of E’, and D”+E” is irreducible between P” and any point of E”,
while K is irreducible between some point of E’ and some point of E”.

Let us select a sequence of subcontinua of N: $N_1, N_2, N_3, \ldots$, such
that for i=1, 2, 3, ..., (1) $N_i$ contains $N_{i+1}$, (2) $N_i$ contains P, but not
P’ or P”, (3) the diameter of $N_i$ is less than 1/i. For each of the sets $N_i$,
let $D_i'$ and $D_i''$ denote the maximal connected subsets of $N - N_i$ containing
P’ and P” respectively. The sequence of continua $N_1 + D_1'$, $N_2 + D_2'$, $N_3 + D_3'$,
..., has the property that each continuum contains the one following it in
the sequence. There is therefore a continuum K’ common to the members
of the sequence. Evidently K’ contains P’ and P, but contains no points
of any of the sets $D_i''$.

Also there is a continuum K” common to all the members of the sequence
$N_1 + D_1''$, $N_2 + D_2''$, $N_3 + D_3''$, ..., and K” contains P” and P, but no points
of any of the sets $D_i'$. Therefore the only points that K’ and K” can have in
common are those points which are common to all members of the sequence
$N_1, N_2, N_3, \ldots$, and the point P is the only point common to all the sets
$N_i$. Therefore K’ and K” have only P in common.

The set K’ can also be expressed as the point P plus the sum of the in-
finite collection of connected sets $D_1', D_2', D_3', \ldots$, each of which is con-
tained in all that follow it in the sequence. It therefore follows that P is
a non-cut point of K’, and that therefore the point P, considered as a point
of the continuum K”, is a limit point of the connected subset K’−P of
M−K”. That is, P fails to have property 6, which is contrary to hypothesis.


THEOREM 10. If a point P of a bounded continuum M has property 5, it has
property 4. 

Suppose P has property 5, but not property 4. Then M contains a sub-
continuum N, such that N−P is disconnected. In that case, N can be ex-
pressed as the sum of two continua $N_1$ and $N_2$ having only P in common.
Let $P_i$ be a point of $N_i$ (i=1, 2), and let $K_i$ be a subcontinuum of $N_i$ which
is irreducible between $P_i$ and P. The continuum $K_1 + K_2$ is irreducible be-
tween $P_1$ and $P_2$, and therefore P does not have property 5, which is contrary
to hypothesis.




THEOREM 11. If a point P of a bounded continuum M has both property
5 and property 9, it has property 6. 


We shall show that if P has properties 5 and 9, and if N is any sub- 
continuum of M containing P, then P is not a limit point of the set M—N, 
and therefore P has property 6. Suppose then that P has properties 5 and 9, 
but that for some subcontinuum N of M, the point P is a limit point of
M−N. We shall show that this leads to a contradiction.

Let $P_0$ be a point of N whose distance from P is less than the distance
x (of property 9). Then N contains a subcontinuum $N_0$ irreducible between
$P_0$ and P, and P is also a limit point of the set $M - N_0$. If P’ is any point
of $M - N_0$, whose distance from P is less than x, every subcontinuum of M
that contains P’ and P must contain the continuum $N_0$, otherwise P fails to
have property 9.

Let $P_1$ be a point of $M - N_0$ whose distance from P is less than x, and
let $N_1$ be a subcontinuum of M irreducible between $P_1$ and P. We have just
shown above that $N_0$ is a subset of $N_1$. Since $N_0$ contains P, the set $N_1 - N_0$
is connected.

If $N_1$ contains all points of $M - N_0$ which are at a distance from P less
than some constant k, then if we add to $N_1 - N_0$ the set of its limit points
(which includes P), the resulting continuum must contain $N_0$, as we have
shown above. In other words, $N_0$ is a continuum of condensation of $N_1$.
But in that case $N_1$ is irreducible between $P_1$ and any point of $N_0$, and P
fails to have property 5. Therefore there are points of $M - N_0$ arbitrarily
close to P which are not points of $N_1$, that is, P is a limit point of the set
$M - N_1$.

Therefore let us select a point $P_2$ of $M - N_1$ whose distance from P is
less than x/2. If $N_2$ is any subcontinuum of M irreducible between $P_2$ and P,
then $N_2$ contains $N_1$ and P is a limit point of the set $M - N_2$. In general, if
(for i=2, 3, 4, ...) $P_i$ is a point of $M - N_{i-1}$ whose distance from P is less
than x/i, then any subcontinuum $N_i$ of M which is irreducible between $P_i$
and P, contains $N_{i-1}$, and P is a limit point of the set $M - N_i$.

Let us then select a definite sequence of points $P_0, P_1, P_2, \ldots$, and a
definite sequence of continua $N_0, N_1, N_2, \ldots$ having the properties des-
cribed above. We shall show that if K denotes the continuum consisting
of $N_0 + N_1 + N_2 + \cdots$ plus limit points, then every proper subcontinuum
of K is a continuum of condensation of K and therefore K is indecomposable.*


*A continuum is said to be indecomposable if it cannot be expressed as the sum of two of its 
proper subcontinua. 




Suppose some proper subcontinuum L of K is not a continuum of con-
densation of K. Evidently the points of L which are not limit points of K−L
do not lie entirely in the set of limit points of $N_0 + N_1 + N_2 + \cdots$, and there-
fore L must contain a point Q of one of the sets $N_0, N_1, N_2, \ldots$, say $N_j$,
such that Q is not a limit point of K−L.

Suppose L does not contain P. The closed set H consisting of (K−L)
plus limit points of (K−L) is a proper subset of K, and consists of a col-
lection (G) of maximal continua of H. The continuum $G_1$ of this collection
that contains P can contain none of the points $P_j + P_{j+1} + P_{j+2} + \cdots$.
The continuum L also can contain only a finite number of points of the
sequence $P_0, P_1, P_2, \ldots$ and therefore there exists an integer k such that
no point of the sequence $P_k, P_{k+1}, P_{k+2}, \ldots$ is a point of L or of $G_1$. Let $P_m$
and $P_n$ be two points such that m ≥ k, n ≥ k, and m ≥ j, n ≥ j. If $P_m$ and $P_n$
were in different continua of (G), then by adding to $L + G_1$ each of these
continua in turn, we obtain two continua in which we can construct sub-
continua of M irreducible between $P_m$ and P and between $P_n$ and P re-
spectively, neither of which contains the other, which is contrary to the
hypothesis that P has property 9. Therefore all points of
$P_{j+k}, P_{j+k+1}, P_{j+k+2}, \ldots$ are points of the same maximal continuum of H.
Since P is a limit point of the set $P_{j+k} + P_{j+k+1} + P_{j+k+2} + \cdots$, it follows
that P also belongs to the maximal continuum of H that contains this set. But
this is impossible, as the continuum $G_1$ containing P contains no points of the set
$P_j + P_{j+1} + P_{j+2} + \cdots$.
Suppose L contains P. If L contains any point $P_n$ of the set
$P_0 + P_1 + P_2 + \cdots$, then L must contain all the points $P_0 + P_1 + \cdots + P_n$. If there
were no integer k such that $P_k$ is a point of K−L then L would be identical
with K, which is contrary to our supposition that L is a proper subcontinuum
of K. Therefore there is some integer k, such that $P_k, P_{k+1}, P_{k+2}, \ldots$ are
all points of K−L. As in the preceding paragraph, all the points
$P_k, P_{k+1}, P_{k+2}, \ldots$ lie in the same continuum of (G), and this continuum also contains
P. But this is impossible, as the continuum containing P cannot contain
any points of the sequence $P_j + P_{j+1} + P_{j+2} + \cdots$. Therefore K is inde-
composable.

Any indecomposable continuum K has this property: given any point P 
there are two other points such that K is irreducible between any two points 
of the three. It follows that if M contains an indecomposable continuum K 
containing P, then P does not have property 5. But this is contrary to 
hypothesis, and therefore Theorem 11 is true. 

Note that the above argument establishes the following theorem: 




THEOREM 12. Given a sequence of points $P_0, P_1, P_2, \ldots$ whose sequential
limit point is P, and a collection of continua $N_0, N_1, N_2, \ldots$, such that $N_i$ is
irreducible between $P_i$ and P, and such that $N_i$ is a proper subset of $N_{i+1}$.
If the continuum K consisting of $N_0 + N_1 + N_2 + \cdots$ plus limit points has
the property that any subcontinuum of K irreducible between $P_i$ and P is a
subset of every subcontinuum of K irreducible between $P_{i+1}$ and P, then K is an
indecomposable continuum.


This theorem can be used to establish the indecomposability of two 
examples due to Knaster.* To recapitulate: 


THEOREM 13. For a point of a bounded continuum, 
(1) property 9 is stronger than property 8; 
(2) property 8 is stronger than property 4; 
(3) property 7 is stronger than property 6; 
(4) property 6 is stronger than property 5; 
(5) property 5 is stronger than property 4; 
(6) properties 5 and 9 together are stronger than property 6. 


The following set of examples shows that no logical relations exist among 
properties 4-9, excepting those given in Theorems 6-11, and therefore the 
truth of Theorem 13 follows from the examples and from Theorems 6-11. 
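One way to see how many examples such a claim calls for is to enumerate every combination of properties 4-9 that is compatible with the implications of Theorems 6-11, and then compare with the combinations the paper records for its examples. The sketch below is an illustration only; the dictionary `EXAMPLES` transcribes the property lists stated in the text for the straight line interval and for Examples 1-11, and all function names are hypothetical.

```python
from itertools import combinations

PROPERTIES = (4, 5, 6, 7, 8, 9)

# Implications of Theorems 6-11, written as (hypotheses, conclusion).
IMPLICATIONS = [
    ({9}, 8), ({8}, 4), ({7}, 6), ({6}, 5), ({5}, 4), ({5, 9}, 6),
]


def admissible(props):
    """True when the set of properties violates none of the implications."""
    return all(concl in props for hyp, concl in IMPLICATIONS if hyp <= props)


def admissible_combinations():
    """All subsets of properties 4-9 compatible with the implications."""
    return {
        frozenset(c)
        for r in range(len(PROPERTIES) + 1)
        for c in combinations(PROPERTIES, r)
        if admissible(set(c))
    }


# Property sets as stated in the text (transcribed, not derived here).
EXAMPLES = {
    "end point of an interval": {4, 5, 6, 7, 8, 9},
    "interior point of an interval": set(),
    "example 1": {4, 5, 6, 7},
    "example 2": {4, 5, 6, 7, 8},
    "example 3": {4, 8, 9},
    "example 4": {4, 8},
    "example 5": {4},
    "example 6": {4, 5},
    "example 7": {4, 5, 8},
    "example 8": {4, 5, 6, 8},
    "example 9": {4, 5, 6},
    "example 10": {4, 5, 6, 8, 9},
    "example 11": {4, 8, 9},
}

allowed = admissible_combinations()
realized = {frozenset(s) for s in EXAMPLES.values()}
print("admissible combinations:", len(allowed))
print("all realized by the listed examples:", allowed == realized)
```

Under these implications the combination of properties 4, 5, 8, and 9 without 6 is excluded by the implication taken from Theorem 11, and every remaining combination appears among the listed examples.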

An end point of a straight line interval has all the properties 4-9; an
interior point has none of them. The point P of example 1 (of Part 3) has 
properties 4-7, but not 8 or 9. The point P of example 2 has properties 4-8, 
but not 9. 

EXAMPLE 3. Let M consist of the curve y=sin (1/x), for 0 < x ≤ 1, and
the straight line interval from (0, —1) to (0, 1). Let P be the point (0, 1), 
which has properties 4, 8, and 9, but not 5, 6, or 7. 

EXAMPLE 4. Let M consist of the continuum of example 3, plus the 
curve y=sin (1/x), for 0 > x ≥ −1. Let P be the point (0, 1), which has
properties 4 and 8, but none of the properties 5-7, 9. 

EXAMPLE 5. Let M consist of the continuum of example 3, plus the set
of semicircles lying on the negative side of the y-axis, with center at the point
$(0, 1 - 3/2^n)$ and radius equal to $1/2^n$, for n=1, 2, 3, .... Let P be the
point (0, 1), which has property 4, but none of the properties 5-9.

EXAMPLE 6. Let M consist of the continuum of example 2, plus the set
of curves 


* Contained in C. Kuratowski, Théorie des continus irréductibles entre deux points, Fundamenta 
Mathematicae, vol. 3 (1922), pp. 200-223. These examples are described on pp. 209-210 and on 
pp. 216-217 respectively. 




1 G =) ( 1 ) 
= — — —) sin 


between $1/2^n < x < 1/2^{n-1}$, for n=1, 2, 3, .... Let P be the point (0, 0),
which has properties 4 and 5, but none of the properties 6-9. 

EXAMPLE 7. Let M consist of the straight line intervals from (1, 0) to
(0, 0), and from (1, 0) to (0, 1/n), for n=1, 2, 3, .... Let P be the point
(0, 0), which has properties 4, 5, and 8, but none of the properties 6, 7, 9.

EXAMPLE 8.* Let M consist of the straight line intervals from (0, 0) to
(1, 0), from (1/2^n, 0) to (1/2^n, 1/2^n), from (1/2^n, 1/2^n) to (-1/2^n, 1/2^n),
from (-1/2^n, 1/2^n) to (-1/2^n, -1/2^n), from (-1/2^n, -1/2^n) to (1, -1/2^n),
from (1, -1/2^n) to (1, -3/2^{n+2}), from (1, -3/2^{n+2}) to (0, -3/2^{n+2}) for
n=0, 1, 2, .... Let P be the point (0, 0), which has properties 4, 5, 6,
and 8, but not property 7 or property 9.

EXAMPLE 9. Let M consist of the continuum of example 8, plus the set of 
semicircles lying on the positive side of the x-axis, with center at the point 
(3/2^n, 0) and radius equal to 1/2^n, for n=2, 3, 4, .... Let P be the point
(0, 0), which has properties 4, 5, 6, but none of the properties 7, 8, and 9. 

EXAMPLE 10. Let M_0 denote the indecomposable continuum described
by Knaster in his thesis,† and designated there by K;. This continuum lies
within the square whose vertices are (0, 0), (1, 0), (1, 1), (0, 1) and is
irreducible between the point (1, 0) and any point which cannot be joined to
(1, 0) by an arc in M_0. The continuum M_0 can also be constructed by the
method used in constructing the second example in Kuratowski’s article.

Let M_n (n=1, 2, 3, ...) be the set of points (x’, y’) obtained by subjecting
the points (x, y) of M_0 to the following transformation: x’=(x+2^{n+1}-2)/2^n,
y’=y/2^n. Let P be the point (2, 0), and let M be the continuum consisting
of P+M_0+M_1+M_2+ .... The point P has properties 4, 5, 6, 8, and 9,
but not property 7.
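The placement of the copies M_n may be checked numerically. The following Python sketch (the function name `transform` is ours, introduced only for illustration) applies the transformation to the extreme abscissas x=0 and x=1 of the unit square. Under the formula as reconstructed above, the image of M_n should occupy the strip from 2-2^{1-n} to 2-2^{-n}, so that successive copies abut and shrink toward P=(2, 0).

```python
from fractions import Fraction


def transform(x, y, n):
    """Map a point (x, y) of M_0 to the corresponding point of M_n."""
    scale = Fraction(2) ** n
    # x' = (x + 2^(n+1) - 2) / 2^n,  y' = y / 2^n
    return (Fraction(x) + 2 * scale - 2) / scale, Fraction(y) / scale


for n in range(1, 5):
    left, _ = transform(0, 0, n)       # image of the left edge of the square
    right, top = transform(1, 1, n)    # image of the upper right corner
    print(n, left, right, top)
```

Exact rational arithmetic is used so that the abutment of M_n and M_{n+1} can be compared without rounding.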

EXAMPLE 11. Let M denote an indecomposable continuum every 
subcontinuum of which is indecomposable. An example of such a continuum 
has been given by Knaster,‡ who designates it by K;. If P is any point of M,
P has properties 4, 8, and 9, but none of the properties 5-7. This combination 
of properties of a point has already been illustrated in example 3. 


5. CONCERNING PROPERTY 3 FOR A BOUNDED CONTINUUM 


The following theorem follows from the definitions of properties 3 and 6. 


* This example is due to Whyburn, loc. cit., proof of Theorem 30. 

† B. Knaster, Un continu dont tout sous-continu est indécomposable, Fundamenta Mathematicae,
vol. 3 (1922), pp. 245-286. This example is described on pp. 269-271.

‡ Loc. cit., pp. 275-279.




THEOREM 14. If a point P of a bounded continuum M has property 3, 
it has property 6. 


We have not been able to determine whether or not P must have property 
3 when it has property 6. However, the arguments given to prove Theorems 
8 and 11 also establish the following theorems concerning property 3. 


THEOREM 15. If a point P of a bounded continuum M has property 7, 
it has property 3. 


THEOREM 16. If a point P of a bounded continuum M has property 5 
and property 9, it has property 3. 


We can combine Theorem 13 and the results of this section into the follow- 
ing theorem. 


THEOREM 17. For a point of a bounded continuum 
(1) property 9 is stronger than property 8; 
(2) property 8 is stronger than property 4; 
(3) property 7 is stronger than property 3; 
(4) property 6 is stronger than property 5; 
(5) property 5 is stronger than property 4; 
(6) properties 5 and 9 are stronger than property 3; 
(7) either properties 3 and 6 are equivalent, or property 3 is stronger than
property 6.


In case property 3 is stronger than property 6, the question remains 
whether examples exist of a point having properties 4-6, 8 but not 3, 7, 9, 
and of a point having properties 4-6, but not properties 3, 7-9, or whether 
some further logical relations exist among properties 3-9 for a bounded 
continuum. 


6. SOME THEOREMS CONCERNING POINTS WITH PROPERTIES 1-9 


In the preceding discussion we have assumed the following theorem, 
whose truth follows from the definitions of properties 1-9. 


THEOREM 18. If a point P of a bounded continuum M has property x 
(where x=1, 2, ..., 9), and N is a subcontinuum of M containing P, then
P has property x in the continuum N. 


THEOREM 19. The set of points of a bounded continuum which have property 
3 is totally disconnected. 




If the theorem were not true, there would exist a bounded continuum 
M which contains a connected subset K consisting of more than one point, 
such that every point of K has property 3. Let P be a point of K. 

If K is a proper subset of M, and Q is any point of M-K, any subcontinuum
N of M which is irreducible between P and Q contains no points
of K, except P, because N-P-Q can contain no points having property 5,
and therefore none having property 3, by Theorems 14 and 9. Therefore
M-(N-P) contains K, and P fails to have property 3, which is contrary
to hypothesis.

If K is identical with M, any subcontinuum N of M which is irreducible
between any two points P and Q of M, consists entirely of points of K.
But no point of N-P-Q can have property 5, and therefore none can have
property 3. Therefore no point of N-P-Q is a point of K, which is contrary
to hypothesis. Therefore Theorem 19 is true.


COROLLARY 19a. The set of points of a bounded continuum which have
property 7, is totally disconnected.*


COROLLARY 19b. The set of points of a bounded continuum which have
both properties 5 and 9 is totally disconnected.


COROLLARY 19c. The set of end points of a continuous curve is totally
disconnected.†


THEOREM 20. A bounded continuum which is irreducible between two of 
its points, cannot contain more than two points with property 5. 


THEOREM 21. The set of points of a bounded continuum which have property 
5, contains no continuum. 


COROLLARY 21a. The set of points of a bounded continuum which have
property 6, contains no continuum. 


However, the set of points of a bounded continuum which have property 
9 may contain a continuum, as was shown in example 11. Similarly with 
the set of points having property 8, and with the set having property 4. 


THEOREM 22. If M is a bounded continuum, and K is any subset of the 
set of points of M which have property 5, then M —K is strongly connected. 


* Menger, loc. cit., p. 283, Theorem V. 

† Whyburn, loc. cit., Theorem 21. See footnote on p. 391. We have shown in Part 2 of this
paper that an end point in Whyburn’s sense is equivalent to an end point in Menger’s sense, in the
case of a continuous curve.

‡ In Whyburn’s Theorem 31 the second paragraph should be omitted, as his “end point” has
property 5, and therefore P cannot be any point other than A or B.




If this is not true, then M-K contains two points P and Q such that
every subcontinuum of M containing P and Q contains a point of K. In
particular, any subcontinuum N of M irreducible between P and Q contains
a point of K. But no point of N-P-Q can have property 5, and therefore
no point of N-P-Q is a point of K, which is a contradiction.


COROLLARY 22a. If M is a continuous curve, and K is any subset of the
end points of M, then M —K is strongly connected.* 


By example 11, we see that Theorem 22 is not true if property 5 is 
replaced by any of the properties 4, 8, or 9. In fact, in that case M —K may 
be disconnected. The theorem remains true, of course, if property 5 is re- 
placed by any of the properties 3, 6, or 7. 


THEOREM 23. A necessary and sufficient condition that a bounded continuum 
M be an acyclic continuous curve, is that every non-cut point have property 5. 


The necessity of the condition follows from the fact that every non-cut
point of an acyclic continuous curve is an end point,† and therefore has
property 5.

The condition is sufficient, because if every non-cut point of a bounded
continuum M has property 5, the set of non-cut points of M is identical
with the set of points of M which have property 5. If K is any subcontinuum
of M, then K contains a subcontinuum N which is irreducible between two
points X and Y, and therefore no point of N-X-Y has property 5. In
other words, every point of N-X-Y is a cut point of M. But if every
subcontinuum K of M contains uncountably many cut points of M, then M
is an acyclic continuous curve.‡

The sufficiency of the condition can also be established by Theorem 22 
and Whyburn’s Theorem 32. 

Note that the condition remains necessary and sufficient if we replace 
property 5 by any of the stronger conditions 3, 6, or 7. In case property 5 is 
replaced by property 4 or property 8 the condition is necessary, but not 
sufficient, as Example 11 shows. If property 5 is replaced by property 9, 
the condition is neither necessary nor sufficient. 


THEOREM 24. If a point P of a bounded continuum M has property 7, 
then P is a limit point of cut points of M. 


* W. L. Ayres, loc. cit., Theorem 6, p. 401. 

† Wilder, loc. cit., Theorem 7, p. 358.

‡ R. L. Moore, Concerning the cut-points of continuous curves, etc., Proceedings of the National
Academy of Sciences, vol. 9 (1923), pp. 101-106. See Theorem E, p. 103. 




Given any positive number ε, there is a cut point of M whose distance
from P is less than ε and therefore P is a limit point of cut points of M.
Since a point having any of the properties 3-9 is a non-cut point of M, 
a point having property 7 is a non-cut point of M which is a limit point of 
cut points. However, the converse is not true, as a point P may be a non-cut 
point and a limit point of cut points, and still not have any of the properties 
3-9, even if M is a continuous curve. An example which shows this, is a 
continuous curve M consisting of the continuum of example 2, plus the 
straight line intervals from (0, 0) to (0, 1), from (0, 1) to (1, 1), and from 
(1, 1) to (1, 0), where P is the point (0, 0). 

The above theorem is not true if we replace property 7 by any or all 
of the properties 3-6, 8-9, as is shown by example 10, where P has all 3-6, 
8-9, but is not a limit point of cut points of M. 


COROLLARY 24a. If P is an end point of a continuous curve M, then P is
a limit point of cut points of M. 


We have already proved Corollary 24a incidentally in the proof of Theorem 1,
where we used this fact to establish that if P has property 1, it has
property 7.


THEOREM 25. If a point P of a bounded continuum M has property 7,
then M is connected im kleinen at P. 


Given any positive number ε, there is a domain D containing P of diameter
less than ε, and such that the set of points of M in D plus its boundary
is a continuum N. If δ is any positive number which is less than the distance
from P to any point of the exterior of D, then any two points of M at a distance
less than δ from P are points of N, and since N is of diameter less than
ε, it follows that M is connected im kleinen at P.


THEOREM 26. If a point P of a bounded continuum M has property 5 
and property 9, then M is connected im kleinen at P. 


In the proof of Theorem 11, we showed that if P has properties 5 and 9, 
and if N is any subcontinuum of M containing P, then P is not a limit point 
of the set M-N. Given any positive number ε, let N be a subcontinuum of
M of diameter less than ε and containing P, and let δ be less than the distance
from P to any point of M—N. Then as in the proof of Theorem 25, it follows 
that M is connected im kleinen at P. 

These two theorems might be combined into the single theorem that M
is connected im kleinen at P if P has properties 3-7, or 3-6, 8, and 9. Example
11 shows that M is not necessarily connected im kleinen at P, if P has properties
4, 8 and 9, and example 8 shows that P may have properties 3-6, and 8,
and still M need not be connected im kleinen at P. Therefore the hypothesis
of neither of the above theorems can be weakened.

Whyburn* has shown that if a point P of a bounded continuum M has 
property 6 and is accessible from some point of a domain complementary to 
M, then M is connected im kleinen at P. This naturally raises the question 
as to the conditions under which a point P having certain of the given proper- 
ties is accessible from a domain complementary to M. 

If M is a continuous curve, every point of M (whether it be an end point 
or not) is accessible from every complementary domain on whose boundary 
the point lies. However, a point may be an end point of a continuous curve
and also have property 8, and still not be on the boundary of any comple- 
mentary domain. An example which shows this, is the continuous curve M 
consisting of the straight line interval from (0, 0) to (1, 0), and the set of 
circles x^2+y^2=1/n^2, for n=1, 2, 3, .... The point P=(0, 0) has properties
1-8, but is not on the boundary of any complementary domain. 

However, if an end point of a continuous curve has property 9, it is a 
boundary point of a complementary domain, by Theorem 4. 

If M is a bounded continuum (not necessarily a continuous curve), a 
point P may have all the properties 3-9, and still not be a boundary point 
of a complementary domain. An example which shows this, is the continuum
consisting of the point (0, 0), the set of circles x^2+y^2=1/4^n, and the
two sets of curves whose equations in polar coördinates are r=[…], for θ≥0,
and r=(3-θ/(θ+1))/2^{n+2}, for θ≥0, and for n=0, 1, 2, .... The point
P=(0, 0) has properties 3-9, but is not on
the boundary of any domain complementary to M. Therefore, in the two 
following theorems, it is necessary to assume that P is on the boundary of a 
complementary domain. 


THEOREM 27. If a point P of a bounded continuum M has property 7 and 
is on the boundary of a domain D complementary to M, then P is accessible from 
D. 


Since P has property 7, if C is any circle about P as center, there exists 
a domain E interior to C, containing P, whose exterior contains points of D,
and whose boundary F has just one point X in common with M. The domain
E can also be selected so that F-X is connected.† The domain E contains


* Loc. cit., Theorem 30, p. 395. 
† R. L. Moore, Concerning the sum of a countable number of mutually exclusive continua in the
plane, Fundamenta Mathematicae, vol. 6 (1924), p. 191, Theorem 3. 




points of D, and since the exterior of E also contains points of D, it follows
that all points of F-X are points of D. If C_1 is any circle about P as center,
and interior to E, any two points P_1 and P_2 of D interior to C_1 can be joined
by a connected subset of D interior to C, the connected subset consisting
of F-X plus the points in E of any arc joining P_1 to P_2 in D. Hence D is
“connected near P” and hence* P is accessible from D.


THEOREM 28. If a point P of a bounded continuum M has property 5 
and property 9, and is on the boundary of a domain D complementary to M, 
then P is accessible from D. 


If B is the boundary of D, then P has properties 3-6, 8-9 in the continuum
B, by Theorem 18. Therefore B-P is connected and B contains no
continuum of condensation containing P, and therefore D is connected
near P, and P is accessible from D.†

Again, we might combine the above two theorems into the single theorem 
that P is accessible from a complementary domain D, if P is on the boundary 
of D, and has either properties 3-7, or properties 3-6, 8-9. Examples 8 and 11 
show that neither hypothesis can be weakened and the theorem remain true, 
as these examples show that a point may have properties 4, 8, and 9 or 
properties 3-6, and 8, and be on the boundary of a complementary domain 
of M, and still not be accessible from that domain. 

In conclusion, I wish to express my sincere appreciation to Professor 
R. L. Moore for his inspiring assistance in the preparation of all the papers 
written during my year as National Research Fellow. 


* R. G. Lubben, Concerning connectedness near a point set.
† R. G. Lubben, loc. cit.


THE UNIVERSITY OF TEXAS,
AUSTIN, TEXAS




THE APPORTIONMENT OF REPRESENTATIVES 
IN CONGRESS* 


BY 
E. V. HUNTINGTON 


INTRODUCTION 


In the absence of any provision for fractional representation in Congress, 
the constitutional requirement that the number of representatives of each 
state shall be proportional to the population of that state cannot be carried 
out exactly; some deviation from strict proportionality is unavoidable, on 
account of the necessary adjustment of fractions. 

Thus, between any two states, there will practically always be a certain 
inequality which gives one of the states a slight advantage over the other. 
A transfer of one representative from the more favored state to the less 
favored state will ordinarily reverse the sign of this inequality, so that the 
more favored state now becomes the less favored, and vice versa. Whether 
such a transfer should be made or not depends on whether the amount of 
inequality between the two states after the transfer is less or greater than 
it was before; if the “amount of inequality” is reduced by the transfer, it is 
obvious that the transfer should be made. 

The fundamental question therefore at once presents itself, as to how 
the “amount of inequality” between two states is to be measured. This is a 
mathematical question of quite unexpected complexity, which has been 
discussed on a scientific basis only within the last few years.† The best
solution of the problem appears to be the Method of Equal Proportions, which 
it is the purpose of the present paper to explain. 

* Presented to the Society, December 28, 1920, February 26, April 23, September 8, and De- 
cember 28, 1921, and February 25, 1922; with the subsequent addition of a number of new examples 
and tables; and read, in part, before the Mathematical Association of America, December 31, 
1926; received by the editors in January, 1927. 

† See E. V. Huntington, A new method of apportionment of representatives, Quarterly Publication
of the American Statistical Association, September, 1921, pp. 859-870; also the Report upon the
Apportionment of Representatives, prepared by the Joint Committee of the American Statistical 
Association and the American Economic Association to Advise the Director of the Census, and pub- 
lished in the same journal, December, 1921, pp. 1004-1013. This Report, which pronounces in favor 
of the Method of Equal Proportions, is reprinted in full in Hon. E. W. Gibson’s Remarks in the 
Congressional Record for April 7, 1926, pp. 6840-6842. The Method of Equal Proportions was 
incorporated in the Bill (H.R. 17378) introduced by Mr. Fenn in the House of Representatives, 
March 2, 1927; see the Report of Hearings held in January and February, 1927, before the House 
Committee on the Census (69th Congress, 2d Session), and the Congressional Record for March 2, 
1927, pp. 5323-5331. 



86 E. V. HUNTINGTON [January 


A FIRST BASIS FOR THE METHOD OF EQUAL PROPORTIONS 


The first measure of the “amount of inequality” between two states, 
which suggests itself, is based on the size of the “congressional district,” 
that is, the result of dividing the population of the state by the number of
its representatives. 

For example, if the population of a certain state A is A = 1,000,000, and the 
number of its representatives is a=4, then the size of a congressional district 
in that state will be A/a =250,000. If the population of a second state B is B 
and the number of its representatives is b, then the size of the congressional 
district in the second state is B/b. 

Now in a perfect apportionment, these two numbers would be exactly 
equal: 

A/a = B/b; 


hence, in any practical case, the inequality between these two numbers— 
that is, the inequality between the two congressional districts, A/a and B/b— 
may be taken as a measure of the “amount of inequality” between the two 
states A and B. If this inequality can be reduced by a transfer of a repre- 
sentative from one state to the other, then, according to this first criterion, 
the transfer should be made. 

The rather vague concept of the inequality between two states is thus 
reduced to the more definite concept of the inequality between two numbers. 

The question then comes down to this: what shall be meant by the 
inequality between these two numbers? Shall we mean the absolute dif- 
ference between the two numbers, or the relative difference between them? 
If the size of the congressional districts is large, say 250,000 in one state and 
250,005 in the other, then the difference of five people is of little consequence 
in so large a number. But if the districts were themselves very small, say 
10 and 15, then the same difference of five people becomes important; 15, 
we say, is larger than 10 by fifty per cent, while 250,005 is larger than 
250,000 by only (1/500)th of 1 per cent. 
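The two comparisons in this paragraph can be reproduced directly. The short Python sketch below assumes the convention stated in the footnote that follows, namely that the relative difference is the absolute difference divided by the smaller number; the function name `relative_difference` is ours.

```python
def relative_difference(p, q):
    """Absolute difference divided by the smaller of two positive numbers."""
    small, large = sorted((p, q))
    return (large - small) / small


# Districts of 10 and 15: the larger exceeds the smaller by fifty per cent.
print(relative_difference(10, 15))
# Districts of 250,000 and 250,005: the same absolute difference of five
# people is only (1/500)th of 1 per cent, that is, a fraction 0.00002.
print(relative_difference(250_000, 250_005))
```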

In the present problem it is clearly the relative or percentage difference, 
rather than the mere absolute difference, which is significant.* Our first 
criterion for a good apportionment may therefore be precisely formulated 
as follows: 


* The relative or percentage difference between two numbers is here thought of as the absolute 
difference divided by the smaller number. For the present purpose it might equally well be thought 
of as the absolute difference divided by the larger number, or the absolute difference divided by the 
(arithmetic, geometric, or harmonic) mean between the two numbers. 




1928] APPORTIONMENT OF REPRESENTATIVES 87


Test 1. If the relative difference between the congressional districts, A/a 
and B/b, belonging to any two states can be reduced by a transfer of a representa- 
tive from one state to the other, then this transfer should be made. 


One further question remains. It is not obvious that a transfer which 
improves the situation between one pair of states, A and B, may not make 
the situation worse between one of these states and some other state; in 
other words, it is not obvious that the test can be applied to all pairs of 
states simultaneously. 

It will be shown below, however, that this is a “workable” test; that is, 
by successive applications of the test, it is always possible to arrive at a final 
apportionment which cannot be “improved” by any further transfer between 
any two states. 

The only known method of apportionment which satisfies Test 1 proves 
to be the Method of Equal Proportions. 


A SECOND BASIS FOR THE METHOD OF EQUAL PROPORTIONS 


A second, and equally obvious, method of defining the “amount of in- 
equality” between two states is based, not on the ratio A/a, but on the ratio 
a/A (that is, the number of representatives divided by the population of 
the state). This number a/A is a small fraction which can be interpreted 
as the individual share of a representative which each inhabitant in the given 
state may be said to control. 

For example, if the number of inhabitants in a given state is A = 1,000,000, 
and the number of representatives is a=4, then the “individual share” of a 
representative which each inhabitant of that state can claim is a/A 
= 1/250,000 =0.000004. If the population of a second state is B and the 
number of its representatives is b, then the “individual share” in the second
state is b/B. 

Now in a perfect apportionment, these two numbers would be exactly 
equal: 

a/A = b/B; 


hence, in any practical case, the inequality between these two numbers— 
that is, the inequality between the individual shares, a/A and b/B—may be 
taken as the measure of the “amount of inequality” between the two states 
A and B; and here, as before, it is clearly the relative or percentage difference, 
rather than the mere absolute difference, which is significant. 

Our second criterion for a good apportionment may therefore be precisely 
formulated as follows: 




Test 2. If the relative difference between the two “individual shares,” a/A 
and b/B, belonging to any two states, can be reduced by a transfer of a repre- 
sentative from one state to the other, then this transfer should be made. 


Here again it will be shown that this is a “workable” test; and the only
known method of apportionment which satisfies this Test 2 is the same 
Method of Equal Proportions which also satisfies Test 1. 
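That one method can satisfy both tests becomes plausible once it is noticed that the two measures of inequality coincide numerically: the individual share a/A is the reciprocal of the congressional district A/a, and the relative difference (taken with respect to the smaller number) is unchanged when both numbers are replaced by their reciprocals. The following Python sketch illustrates this with exact fractions. State A is the one used in the text; the figures for state B (1,300,000 inhabitants, 5 representatives) are hypothetical, and the function names are ours.

```python
from fractions import Fraction


def relative_difference(p, q):
    """Absolute difference divided by the smaller of two positive numbers."""
    small, large = sorted((p, q))
    return (large - small) / small


def inequality_test1(A, a, B, b):
    """Relative difference between the congressional districts A/a and B/b."""
    return relative_difference(Fraction(A, a), Fraction(B, b))


def inequality_test2(A, a, B, b):
    """Relative difference between the individual shares a/A and b/B."""
    return relative_difference(Fraction(a, A), Fraction(b, B))


# State A as in the text; state B is a hypothetical illustration.
print(inequality_test1(1_000_000, 4, 1_300_000, 5))
print(inequality_test2(1_000_000, 4, 1_300_000, 5))
```

Both calls return the same fraction, so a transfer that reduces the inequality in the sense of Test 1 also reduces it in the sense of Test 2, and conversely.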


WORKING RULE FOR THE METHOD OF EQUAL PROPORTIONS 


We now turn to a purely technical question, of little interest except 
to the computers in the Bureau of the Census. 

Given, the populations of the several states; and given, the size of the 
House, that is, the total number of representatives to be assigned; how shall 
we actually compute an apportionment which will satisfy Test 1 and Test 2? 
The practical working rule for the computation is as follows: 

First, assign one representative to each state (here 48 in number). 

Next, for each state, make out a series of cards, each card containing:
(1) the name of the state; (2) a serial number, k, starting with 2 and running
up to a number somewhat greater than the number of representatives that
that state is expected to receive; and (3) a “rank index,” found by multiplying
the population of the state by a certain “multiplier,” given, for each serial
number, in the adjoining table.

             Method EP
        No.     Multiplier
         2      1/(1·2)^{1/2}
         3      1/(2·3)^{1/2}
         4      1/(3·4)^{1/2}
        ...     ...

Then combine all these series of cards into a single series, arranged in
order of the “rank indices,” from the highest to the lowest, thus forming
what may be called a “priority list,” for the given populations, and any
size of House.*

Finally, assign additional representatives (after the first) to the several 
states in the order in which the cards occur in this “priority list,” continuing 
the assignment as far as may be necessary to fill up a House of any desired 
size. 
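The working rule translates almost line for line into a short program. The following Python sketch is one possible rendering, not the procedure actually used by the Bureau of the Census: the function name `equal_proportions` and the three populations are hypothetical, and ties are broken in favor of the larger population, as suggested in the footnote.

```python
from math import sqrt


def equal_proportions(populations, house_size):
    """Apportion house_size seats by the priority-list rule (Method EP)."""
    if house_size < len(populations):
        raise ValueError("every state must receive at least one representative")
    # First, assign one representative to each state.
    seats = {state: 1 for state in populations}
    extra = house_size - len(populations)
    # Next, one "card" per state and serial number k = 2, 3, ...;
    # the rank index is the population times 1/[(k-1)k]^(1/2).
    cards = []
    for state, population in populations.items():
        for k in range(2, extra + 2):
            cards.append((population / sqrt((k - 1) * k), population, state))
    # Then arrange the cards from the highest rank index to the lowest
    # (the larger population takes priority in case of a tie).
    cards.sort(key=lambda card: (-card[0], -card[1]))
    # Finally, hand out the remaining seats in the order of the priority list.
    for _, _, state in cards[:extra]:
        seats[state] += 1
    return seats


# Hypothetical populations, chosen only for illustration.
print(equal_proportions({"A": 5300, "B": 3100, "C": 1600}, 10))
# → {'A': 5, 'B': 3, 'C': 2}
```

Because the priority list does not depend on the size of the House, the same sorted list of cards serves for every House size; only the number of cards taken from the top changes.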

An apportionment worked out according to this rule will always satisfy 
Test 1 and Test 2, as will be shown below. In practice, it may be found 
convenient to number the cards of the “priority list” consecutively, in red 
ink, beginning with the number (here 49) one greater than the number of 
states, and continuing until any desired total number of representatives 




* In case two cards bear the same index number, the state having the larger population may be
given priority. This case of a “tie” will be extremely rare, however, on account of the irrationality of
the “multipliers” (see a later paragraph).




(say 435) has been reached. For most purposes, however, the earlier part 
of the list may be omitted. 
In the following table the multipliers are given to seven decimal places. 


Table of Multipliers (Method EP)

No.  Multiplier    No.  Multiplier    No.  Multiplier    No.  Multiplier
 1   --            14   .074 1249     27   .037 7426     40   .025 3185
 2   .707 1068     15   .069 0066     28   .036 3696     41   .024 6932
 3   .408 2483     16   .064 5497     29   .035 0931     42   .024 0981
 4   .288 6751     17   .060 6339     30   .033 9032     43   .023 5310
 5   .223 6068     18   .057 1662     31   .032 7913     44   .022 9900
 6   .182 5742     19   .054 0738     32   .031 7500+    45   .022 4733
 7   .154 3033     20   .051 2989     33   .030 7729     46   .021 9793
 8   .133 6306     21   .048 7950+    34   .029 8541     47   .021 5066
 9   .117 8511     22   .046 5242     35   .028 9886     48   .021 0538
10   .105 4093     23   .044 4554     36   .028 1718     49   .020 6197
11   .095 3463     24   .042 5628     37   .027 3998     50   .020 2031
12   .087 0388     25   .040 8248     38   .026 6690     51   .019 8030
13   .080 0641     26   .039 2232     39   .025 9762


In this table, if k=the serial number, the “multiplier”=1/[(k-1)k]^{1/2}.
The entries in the table may be verified, with a computing machine, by the 
process of squaring and taking reciprocals, without extracting square roots. 
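The verification described here is easily mechanized. In the following Python sketch (the function name `multiplier` is ours) three tabulated entries are compared with the formula, and each is also checked in the manner suggested in the text: squaring the tabulated value and taking the reciprocal should return, very nearly, the integer (k-1)k.

```python
from math import sqrt


def multiplier(k):
    """Multiplier for serial number k: 1/[(k-1)k]^(1/2)."""
    return 1 / sqrt((k - 1) * k)


# (serial number, value as tabulated to seven decimal places)
for k, tabulated in [(8, 0.1336306), (43, 0.0235310), (51, 0.0198030)]:
    computed = round(multiplier(k), 7)
    # Check without square roots: 1/tabulated^2 should be close to (k-1)k.
    reciprocal_of_square = 1 / tabulated ** 2
    print(k, computed, reciprocal_of_square, (k - 1) * k)
```

The small departures of the reciprocal of the square from the integer (k-1)k reflect only the rounding of the table to seven places.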

An illustration of the use which may be made of the table of multipliers, 
even before the priority list is completed, is the following: Any state A will 
receive its 43d representative before another State B receives its 8th repre- 
sentative, provided the population of State A multiplied by 0.0235310 is 
greater than the population of State B multiplied by 0.1336306. 
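The comparison in this illustration amounts to comparing two rank indices. A minimal Python sketch follows; the populations used are hypothetical and the function names are ours.

```python
from math import sqrt


def rank_index(population, k):
    """Rank index of a state's k-th representative under Method EP."""
    return population / sqrt((k - 1) * k)


def receives_first(pop_a, k_a, pop_b, k_b):
    """True if State A receives its k_a-th representative before
    State B receives its k_b-th representative."""
    return rank_index(pop_a, k_a) > rank_index(pop_b, k_b)


# Hypothetical populations: does A get its 43d seat before B gets its 8th?
print(receives_first(6_000_000, 43, 1_000_000, 8))
print(receives_first(5_000_000, 43, 1_000_000, 8))
```

With these figures the first comparison comes out in favor of State A (about 141,186 against 133,631) and the second in favor of State B (about 117,655 against 133,631).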

Since the typical multiplier, 1/[x(x+1)]^{1/2}, is the reciprocal of the
geometric mean between the successive integers x and x+1, the Method of 
Equal Proportions might be called also the Method of the Geometric Mean.* 

The proof of the correctness of the rule for Method EP is as follows. 

Suppose that, in an apportionment made according to the rule, any State 
A has received x+1 representatives and any other State B has received y 
representatives; and suppose (as we may, without loss of generality) that 


* The first use of the geometric mean in connection with this problem occurs in the Method of 
Alternate Ratios, proposed by Dr. J. A. Hill in 1910; this method, though obtained through entirely 
different reasoning (Tests 1 and 2 being unknown at that time), differs from the Method of Equal 
Proportions only in the fact that it insists on too close a relationship between the assignment given 
to any state and the true quota of that state. (This defect leads to an “Alabama paradox,” as we 
shall see below.) Dr. Hill was also the first writer to recognize the superiority of the relative difference 
over the absolute difference, in the solution of this problem. See his paper in House of Representatives 
Report No. 12, of the Sixty-Second Congress, First Session, April 25, 1911. 




State A is over-represented in comparison with State B. We proceed to 
show that if one representative is transferred from State A to State B, the 
“inequality” between the two states (measured according to Test 1 or Test 2) 
will be thereby increased.

We begin by showing that in the hypothetical apportionment, in which 
State A has x representatives and State B has y+1, the latter state will be 
over-represented in comparison with the former. 

From the way in which the “priority list” is constructed we know that 
A^2/[x(x+1)] > B^2/[y(y+1)]; and since in the actual assignment A is over-
represented in comparison with B, we know that B/y > A/(x+1), and hence
(x+1)/A > y/B. It follows that B/(y+1) < A/x, and hence x/A < (y+1)/B,
since the contrary assumption would lead to contradiction. But these last 
relations express the fact that after the transfer is made, State B is over- 
represented in comparison with State A. 

We can now write down the expression for the “inequality” between the
two states before and after the transfer, remembering the convention that the 
“percentage difference” between any two numbers is understood to mean the 
“absolute difference divided by the smaller number.” 

In the actual assignment (before the transfer), the inequality in question
is, by Test 1, [B/y - A/(x+1)]/[A/(x+1)], or, by Test 2, [(x+1)/A - y/B]/[y/B],
each of which reduces to B(x+1)/(Ay) - 1.

In the hypothetical assignment (after the transfer), the inequality is,
by Test 1, [A/x - B/(y+1)]/[B/(y+1)], or, by Test 2, [(y+1)/B - x/A]/[x/A],
each of which reduces to A(y+1)/(Bx) - 1.

But from the given relation A^2/[x(x+1)] > B^2/[y(y+1)], we have at
once A^2(y+1)/x > B^2(x+1)/y, whence


A(y + 1)/(Bx) - 1 > B(x + 1)/(Ay) - 1,


which shows that the inequality between the two states would be increased 
by the transfer. 

In other words, an apportionment made according to the rule, is one 
which cannot be “improved” (in the sense of Test 1 or Test 2), by any 
transfer of a representative from any state to any other state. 
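The conclusion of the proof can be tested by brute force on a small case. The Python sketch below (function names and populations are ours, chosen only for illustration) tries every transfer of one representative between every ordered pair of states and reports whether any of them reduces the relative difference between the two congressional districts, in the sense of Test 1. Exact fractions are used so that no comparison depends on rounding.

```python
from fractions import Fraction
from itertools import permutations


def inequality(pop_a, seats_a, pop_b, seats_b):
    """Relative difference between the districts pop_a/seats_a and pop_b/seats_b."""
    small, large = sorted((Fraction(pop_a, seats_a), Fraction(pop_b, seats_b)))
    return (large - small) / small


def can_be_improved(populations, seats):
    """True if some transfer of one representative reduces the inequality
    between the two states concerned (Test 1)."""
    for giver, taker in permutations(populations, 2):
        if seats[giver] < 2:
            continue  # every state keeps at least one representative
        before = inequality(populations[giver], seats[giver],
                            populations[taker], seats[taker])
        after = inequality(populations[giver], seats[giver] - 1,
                           populations[taker], seats[taker] + 1)
        if after < before:
            return True
    return False


# Hypothetical populations; 5, 3, 2 is what the working rule gives for a
# House of 10, while 6, 3, 1 is an arbitrary alternative for comparison.
populations = {"A": 5300, "B": 3100, "C": 1600}
print(can_be_improved(populations, {"A": 5, "B": 3, "C": 2}))  # → False
print(can_be_improved(populations, {"A": 6, "B": 3, "C": 1}))  # → True
```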


CRITIQUE OF TWO CONFLICTING METHODS 


As pointed out above, whichever definition of the amount of inequality 
between two states may be adopted, it is clearly the relative or percentage 
difference, rather than the mere absolute difference, which is significant. 
The inappropriateness of the absolute difference is made still more apparent 
by the fact that its use leads us to two conflicting methods of apportionment. 




Thus, if we substitute the “absolute difference” for the “relative dif- 
ference” in Tests 1 and 2, we have the following tests: 


Test 1a (not recommended). If the absolute difference between the two 
congressional districts, A/a and B/b, can be reduced by a transfer of a repre- 
sentative from one state to the other, then this transfer should be made. 


This test leads to a distinct method of apportionment, known as the 
Method of the Harmonic Mean (HM).* 


Test 2a (not recommended). If the absolute difference between the two 
“individual shares,” a/A and b/B, can be reduced by a transfer of a representative 
from one state to the other, then this transfer should be made (except that no state 
shall be left without at least one representative). 


This test leads to another distinct method of apportionment, known as
the Method of Major Fractions (MF).†

Thus, while each of these tests is a “workable” test, each leads to a 
distinct method of apportionment. In comparison with the Method of 
Equal Proportions, the Method of the Harmonic Mean favors the small 
states unduly, while the Method of Major Fractions favors the large states 
unduly. This is illustrated in Example 1, showing the apportionment of 16 
representatives among three states with a total population of 1600. 

In this example, it is fairly obvious that State A should have at least 7 
representatives, State B at least 5, and State C at least 3. But this makes 
only 15 in all. Where shall the 16th representative be assigned? Method HM 
gives it to the smallest state (C), and Method MF gives it to the largest 
state (A); while the Method of Equal Proportions gives it to the middle- 
sized state (B). 
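
The disposition of the 16th representative can be illustrated by a short computation. This is a sketch under stated assumptions: each state's claim to its next seat is taken to be its population divided by a mean of x and x+1 (harmonic for HM, geometric for EP, arithmetic for MF), following the working rules of the three methods; the names MEANS and sixteenth_seat are hypothetical.

```python
from math import sqrt

# Three means of the consecutive integers x and x+1; a state's claim to its
# (x+1)-th seat is taken to be population / mean(x).
MEANS = {
    "HM": lambda x: 2 * x * (x + 1) / (2 * x + 1),  # harmonic mean
    "EP": lambda x: sqrt(x * (x + 1)),              # geometric mean
    "MF": lambda x: x + 0.5,                        # arithmetic mean
}

pops = {"A": 729, "B": 534, "C": 337}   # Example 1
base = {"A": 7, "B": 5, "C": 3}         # the 15 seats not in dispute

def sixteenth_seat(method):
    """Return the state with the strongest claim to the 16th seat."""
    mean = MEANS[method]
    claims = {s: pops[s] / mean(base[s]) for s in pops}
    return max(claims, key=claims.get)

for method in MEANS:
    print(method, sixteenth_seat(method))
```

Under these assumptions the sketch gives C for Method HM, B for Method EP, and A for Method MF, in agreement with the statement above.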

The computations in the right-hand part of the table will be
self-explanatory. Method HM differs from Method EP only in regard to States
B and C; the “amount of inequality” between these two states is smaller in
column HM when Test 1a is used, and smaller in column EP when Test 1 is
used. Similarly, Method MF differs from Method EP only in regard to
States A and B; the “amount of inequality” between these two states is
smaller in column MF when Test 2a is used, and smaller in column EP when
Test 2 is used.


* See E. V. Huntington, in the paper already cited; or a brief abstract entitled The mathematical
theory of the apportionment of representatives, in the Proceedings of the National Academy of Sciences,
vol. 7, pp. 123-127, April, 1921.

† The Method of Major Fractions was devised by Professor W. F. Willcox in 1910, and was used
in the apportionment for that year. See his paper in House of Representatives Report No. 12, of the
Sixty-Second Congress, First Session, April 25, 1911, and his presidential address as president of
the American Economic Association, published in the American Economic Review, vol. 6, no. 1,
Supplement, pp. 1-16, March, 1916; also F. W. Owens, On the apportionment of representatives,
Quarterly Publication of the American Statistical Association, December, 1921, pp. 958-968. The
Report of the Advisory Committee cited above, concluded, after elaborate hearings, that the Method
of Major Fractions was less desirable than the Method of Equal Proportions.


92 E. V. HUNTINGTON [January


Example 1

                  Assignment of Reps.   Size of Congr. Dist.   Size of Indiv. Share
                                          (Tests 1a and 1)       (Tests 2 and 2a)
State     Pop.     HM    EP    MF          HM        EP            EP        MF
A          729      7     7     8                                0.00960   0.01097
B          534      5     6     5        106.80     89.00        0.01124   0.00936
C          337      4     3     3         84.25    112.33
          1600     16    16    16

Absolute Difference                       22.55     23.33        0.00164   0.00161
Relative Difference                       0.268     0.262        0.170     0.172

Thus Tests 1a and 2a lead to conflicting results (Methods HM and MF); 
if these were the only tests available, it would be difficult to make a choice 
between them on any but arbitrary grounds. 

On the other hand, Tests 1 and 2 lead to no such dilemma, since the 
Method of Equal Proportions satisfies them both. This fact strengthens our 
belief that in defining a measure of inequality between two states, the 
relative difference is more natural and useful than the absolute difference.* 
The two conflicting Methods, HM and MF, may be regarded as on the same 
level of merit, as between themselves; but both of them are inferior to the 
Method of Equal Proportions. 


WORKING RULE FOR METHOD OF HARMONIC MEAN 


The working rule for the Method of the Harmonic Mean is the same as the
rule for the Method of Equal Proportions, if, in forming the “priority list,”
we replace the series of multipliers, there given, by the following:

1/[2(1·2)/(1 + 2)],   1/[2(2·3)/(2 + 3)],   1/[2(3·4)/(3 + 4)],   ...


* See, however, Test 33, below, which uses either the absolute or the relative difference at
pleasure.




The name Method of the Harmonic Mean is suggested by the fact that 
the typical multiplier, 1/[2(x)(x+1)/(x+x+1)], is the reciprocal of the 
harmonic mean between the successive integers x and x+1. 

The proof that this rule will result in an apportionment satisfying Test 1a 
is as follows: Suppose, as before, that State A has x+1 representatives, and 
State B has y representatives, and that State A is over-represented in 
comparison with State B. Then the inequality between these two states, 
measured according to Test 1a, is B/y—A/(x+1). If, on the other hand, 
State A had only x representatives and State B had y+1, then the inequality 
between the two states would be A/x − B/(y+1). From the way in which
the priority list is constructed, we know that A(x+x+1)/[2(x)(x+1)]
> B(y+y+1)/[2(y)(y+1)], whence A/x + A/(x+1) > B/y + B/(y+1),
whence A/x − B/(y+1) > B/y − A/(x+1).


WORKING RULE FOR THE METHOD OF MAJOR FRACTIONS


The working rule for the Method of Major Fractions is the same as the 
rule for the Method of Equal Proportions, with the replacement of the series 
of multipliers, there given, by the following: 


1/(1 + 1/2),   1/(2 + 1/2),   1/(3 + 1/2),   ...


Since the typical multiplier, 1/(x + 1/2), is the reciprocal of the arithmetic
mean between the successive integers x and x+1, the method might be
called the Method of the Arithmetic Mean. The name “Method of Major 
Fractions,” which is now well established, is due to Professor W. F. Willcox, 
who, approaching the subject from an entirely different point of view, 
devised the working rule, as a practical method of computation, in 1910, 
at a time when none of the theoretical tests (1, 2, 1a, 2a) were known. The 
Method of Major Fractions satisfies none of these tests except Test 2a. 

The proof that this rule will result in an apportionment satisfying Test 
2a is as follows: Suppose, as before, that State A has x+1 representatives, 
and State B has y representatives, and that State A is over-represented in 
comparison with State B. Then the inequality between those two states, 
measured by Test 2a, is (x+1)/A—y/B. If, on the other hand, State A had x 
and State B had y+1, then the inequality between the two states would be 
(y+1)/B − x/A. Now, from the way in which the priority list is constructed,
A/(x + 1/2) is greater than B/(y + 1/2); hence (2y+1)/B > (2x+1)/A, whence
(y+1)/B − x/A > (x+1)/A − y/B, which was to be proved.




REMARKS ON THE NAME “METHOD OF MAJOR FRACTIONS” 
AND THE “EXACT QUOTA” OF A STATE 


As a question of practical politics, the controversy at the present time is 
chiefly between the Method of Equal Proportions (EP) and the Method of 
Major Fractions (MF). 

To avoid any possible misinterpretation of the name “Method of Major 
Fractions,” the following remarks are here inserted. 

In a theoretically perfect apportionment, the exact quota of any state A 
is A(R/P), where A is the population of the state, R is the total number of 
representatives in the House, and P is the total population of the country. 
If the exact quotas of all the states came out as whole numbers, the problem of 
apportionment would be solved without further ado. But in practically all 
cases, the exact quota will not be a whole number, and the actual assignment 
must be greater or less than the quota. 

Now it is a common misconception that in a good apportionment the 
actual assignment should not differ from the exact quota by more than one 
whole unit; for example, if the exact quota is 5.21 or 5.76, then it is often 
assumed that the actual assignment should not be less than 5 nor more than 6. 

It is a further misconception that if the exact quota is, say, 5 and a 
fraction, then if the fraction is less than 1/2 it should be disregarded, but if 
it is greater than 1/2, it should add one to the assignment. For example, 
it is often assumed that if the quota is 5.21, the assignment should be 5; 
and if the quota is 5.76, the assignment should be 6. 


As a matter of fact, however, neither of these principles is a workable test 
of a good apportionment, and the Method of Major Fractions, like every other 
known method of apportionment, will often violate both of them. 


Thus, in Example 2, both the Method of Major Fractions and the Method 
EP assign only 90 representatives to State A, although the exact quota of 
that state is 92.15. 

Again, in Example 3, both methods assign 90 representatives to State A, 
although the exact quota of that state is only 87.85. 

Further, in Example 4, the true quota of State A is 9.87; but both 
methods give State A only 9 representatives, in spite of the fact that the 
fraction 0.87 is very much greater than 1/2, and is, in fact, the largest of 
the three fractions which occur in this example. 

Again, in Example 5, the true quota of State A is 7.31; but both methods
give this state 8 representatives, in spite of the fact that the fraction 0.31
is less than 1/2, and is, in fact, the smallest of the three fractions that occur
in this example.

Although crucial examples of this sort are not easy to construct, the 
existence of these examples is sufficient to show that the “Method of Major 
Fractions” does not imply that a “major fraction” in the quota of a state will 
always entitle that state to an additional representative, or that a “minor fraction” 
is always to be disregarded. 

As a matter of fact, the size of the quota of an individual state, taken by 
itself, does not determine the number of representatives to which that 
state is entitled. For instance, in Examples 5 and 5a, the quota of State B 
is the same in both cases (5.35); and yet (according to either Method MF 
or Method EP), the number of representatives assigned to this state is 5 
in one case and 6 in the other. This variation in the assignment given to 
State B is due not to any change in State B itself, but to a slight shift of 
population between the other two states. 
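
The point can be illustrated by computing the exact quotas and the assignments for Examples 5 and 5a side by side. The sketch below assumes the working rule for the Method of Major Fractions in the form of a priority list with multipliers 1/(x + 1/2); the function names are hypothetical.

```python
def major_fractions(pops, seats):
    """One seat per state, then each further seat goes to the state with
    the largest value of pop / (x + 1/2), x being its current seats."""
    reps = [1] * len(pops)
    while sum(reps) < seats:
        i = max(range(len(pops)), key=lambda j: pops[j] / (reps[j] + 0.5))
        reps[i] += 1
    return reps

def quotas(pops, seats):
    """Exact quota A(R/P) of each state."""
    total = sum(pops)
    return [p * seats / total for p in pops]

examples = {"Example 5": [731, 535, 334], "Example 5a": [729, 535, 336]}
for name, pops in examples.items():
    print(name, [round(q, 2) for q in quotas(pops, 16)],
          major_fractions(pops, 16))
```

In both examples the quota of State B comes out as 5.35, while the sketch assigns it 5 representatives in Example 5 and 6 in Example 5a.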


Example 2

State     Pop.     Reps. (EP and MF)
A         9215        90
B          159         2
C          158         2
D          157         2
E          156         2
F          155         2
        10,000       100


Example 3

State     Pop.     Reps. (EP and MF)
A         8785        90
B          126         1
C          125         1
D          124         1
E          123         1
F          122         1
G          121         1
H          120         1
I          119         1
J          118         1
K          117         1
        10,000       100


Example 4

State     Pop.     Reps. (EP and MF)
A          987         9
B          157         2
C          156         2
          1300        13


Example 5

State     Pop.     Reps. (EP and MF)
A          731         8
B          535         5
C          334         3
          1600        16


Example 5a

State     Pop.     Reps. (EP and MF)
A          729         7
B          535         6
C          336         3
          1600        16




Again, in Examples 6 and 6a, the quota for State B is exactly 44 in each 
case; but the actual assignment (according to either Method MF or Method 
EP) is 43 in one case and 45 in the other. 


Example 6

State     Pop.     Quota    Reps. (EP and MF)
A         5117     51.17       51
B         4400     44.00       43
C          162      1.62        2
D          161      1.61        2
E          160      1.60        2
        10,000    100         100


Example 6a

State     Pop.     Quota    Reps. (EP and MF)
A         5189     51.89       52
B         4400     44.00       45
C          138      1.38        1
D          137      1.37        1
E          136      1.36        1
        10,000    100         100


Example 7

State     Pop.     Quota    No. of Reps.
A         1536     15.36       15
B         1535     15.35       15
C         1534     15.34       15
D         1533     15.33       15
E         1532     15.32       15
F         1530     15.30       15
G          162      1.62        2
H          161      1.61        2
I          160      1.60        2
J          159      1.59        2
K          158      1.58        2
        10,000    100         100

Group     Pop.     Quota    No. of Reps.
ABCDEF    9200     92.00       90
GHIJK      800      8.00       10
        10,000    100         100


Furthermore, Example 7 shows that “nearness to the quota” with respect
to groups of states is incompatible with “nearness to the quota” with
respect to single states.

In this example, the quota of the group of large states is 92, while the 
actual assignment to this group is only 90; and the quota for the group of 
small states is 8, while the actual assignment to this group is 10. A transfer 
of a representative from the small group to the large group (say from State 
K to State A) would bring both groups “nearer to the quota”; and yet no one 
would wish to make this transfer. 

In short, “nearness to the quota” cannot be taken as a test of a good assign- 
ment, either for a single state or for a group of states. 


REMARKS ON THE WILLCOX SLIDING DIVISOR 


The origin of the name “Method of Major Fractions” is to be found in 
an ingenious device known as the “sliding divisor,” and due, in its present form, 
to Professor Willcox. 




After an apportionment has been computed by the working rule for the
Method of Major Fractions, the sliding divisor may be used to facilitate the
recording of the results. This device is supplementary to the actual
computation and forms no essential part of it; it is interesting chiefly as
explaining the origin of the name.

The device consists in the selection of any number W such that

    A/(a − 1/2) > W > A/(a + 1/2),   B/(b − 1/2) > W > B/(b + 1/2),
    C/(c − 1/2) > W > C/(c + 1/2),   etc.,

where a, b, c, etc., are the assignments of representatives to the States
A, B, C, etc., according to Method MF.

Such a number W, which may be called a Willcox Divisor, will always 
exist,* and will have the following property: If the population of each State 
is divided by W, there will be obtained a series of quotients such that, if 
one representative is assigned for each unit and for each major fraction 
(and also for each quotient which is itself less than one-half), the resulting 
apportionment will be precisely the same as the apportionment given by the 
working rule for the Method MF, and will therefore satisfy Test 2a. 

By the use of this device, the assignment given to any State can be figured 
out at once from the population of that State, as soon as the value of the 
Willcox divisor has been announced. 

It should be noticed, however, that the Willcox divisor is not the true 
value of the average Congressional District, and the Willcox quotients are not 
the true quotas of the several states; hence the occurrence of a major fraction 
in the “quotient” of a State gives that State no claim whatever to an ad- 
ditional representative, except the claim which is already implied by Test 2a. 
If a method could be found which would assign an additional representative
for every major fraction in the true quota, it would be indeed a simple and
attractive method; but as we have seen, no such method is possible.
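
The relation between the Willcox divisor and the assignment may be illustrated on Example 1. The sketch below is offered under stated assumptions: the admissible divisors are taken to be those lying between A/(a + 1/2) and A/(a − 1/2) for every state, and the rounding rule gives one representative for each unit and each major fraction, with at least one per state. The function names are hypothetical.

```python
import math

def willcox_range(pops, reps):
    """Open interval of divisors W with A/(a + 1/2) < W < A/(a - 1/2)
    holding for every state simultaneously."""
    low = max(p / (r + 0.5) for p, r in zip(pops, reps))
    high = min(p / (r - 0.5) for p, r in zip(pops, reps))
    return low, high

def round_major_fractions(quotient):
    """One seat for each unit and each major fraction; never fewer than one."""
    return max(1, math.floor(quotient + 0.5))

pops, reps = [729, 534, 337], [8, 5, 3]   # Example 1, Method MF
low, high = willcox_range(pops, reps)
w = (low + high) / 2                      # any value inside the interval will do
print(low, high, [round_major_fractions(p / w) for p in pops])
print(sum(pops) / sum(reps))              # true average district, for comparison
```

For this example the admissible divisors lie between about 97.09 and 97.20, whereas the true average Congressional District is 100; the quotients obtained from W are therefore not the true quotas.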

The Willcox “sliding divisor” merely provides a convenient way of re- 
cording the result of the method based on Test 2a, and adds nothing (except 
the name) to the authority of that method. The simplest basis for a valid 
method is a direct comparison between competing States, as expressed in 
Test 1 or Test 2; and the only method which satisfies either of these simple 
and natural tests is the Method of Equal Proportions. 




NOTE ON THE ALABAMA PARADOX 


The curious situation known as the “Alabama Paradox” is a further 
illustration of the confusion resulting from the unwise use of the exact 
quota of a state in computing the apportionment. 


* Except in the case of a “tie” between two states (see below). 




This paradox first came to the attention of Congress in the tables pre- 
pared in 1881, which gave Alabama 8 members in a House of 299, and only 
7 members in a House of 300, so that an increase in the total size of the House 
actually produced a decrease in the number of representatives of one of the 
states. 

The method in use at that time was known as the Vinton Method of 1850. 
This method assumed that each state was entitled to at least as many repre- 
sentatives as was indicated by the largest whole number contained in the 
exact quota of that state (with the special provision that no state should 
have less than one representative). To fill up the required total, further 
representatives were then assigned, “for fractions,” to as many states as 
necessary, the states being arranged, for this latter purpose, in a “priority list,” 
according to the magnitude of the fractions themselves, so that the state with 
the largest fraction was the first to receive an additional representative. 

The resulting paradox is illustrated in Example 8, where State C has 11 
representatives in a House of 100 members, and only 10 representatives in 
a House of 101. 

A similar defect occurs in the otherwise excellent Method of Alternate
Ratios proposed by Dr. J. A. Hill in 1910. This method proceeds as in the
Vinton Method, except that the “priority list” for fractions is arranged
according to the magnitude of the quantity A/[x(x+1)]^(1/2), where x is the
number already assigned to State A, and x+1 is the next larger number.
The possible paradox resulting from this method is shown in Example 8a,
where States G and H each lose one representative when the size of the House
is increased from 100 to 101.

No method can be regarded as satisfactory which is subject to the Alabama 
Paradox. 
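
The paradox of Example 8 can be reproduced by a direct computation. The following sketch implements the Vinton Method as described above (whole-number part of the exact quota, at least one seat per state, and the remaining seats given to the largest fractions); exact rational arithmetic is used so that the fractions are compared without rounding error. The function name is hypothetical.

```python
from fractions import Fraction

def vinton(pops, seats):
    """Vinton Method as described in the text: whole part of each exact
    quota, then one extra seat to each of the largest fractions."""
    total = sum(pops)
    quotas = [Fraction(p * seats, total) for p in pops]
    reps = [max(1, q.numerator // q.denominator) for q in quotas]
    # States ranked by the fractional part of the quota, largest first.
    order = sorted(range(len(pops)),
                   key=lambda i: quotas[i] - int(quotas[i]), reverse=True)
    extra = max(0, seats - sum(reps))
    for i in order[:extra]:
        reps[i] += 1
    return reps

pops = [453, 442, 105]        # States A, B, C of Example 8
print(vinton(pops, 100))      # House of 100
print(vinton(pops, 101))      # House of 101
```

With these populations State C receives 11 representatives in a House of 100 and only 10 in a House of 101.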


Example 8.  Vinton Method (Paradox)

                    House of 100        House of 101
State     Pop.     Quota    Rep.       Quota     Rep.
A          453      45.3     45        45.753     46
B          442      44.2     44        44.642     45
C          105      10.5     11        10.605     10
          1000     100.0    100       101.000    101


Example 8a.  Method of Alternate Ratios (Paradox)

                       House of 100          House of 101
State     Pop.        Quota     Rep.       Quota       Rep.
A        154550       30.91      30        31.2191      31
B        154500       30.90      30        31.2090      31
C        154450       30.89      30        31.1989      31
D          7400        1.48       2         1.4948       2
E          7350        1.47       2         1.4847       2
F          7300        1.46       2         1.4746       2
G          7250        1.45       2         1.4645       1
H          7200        1.44       2         1.4544       1
         500000      100.00     100       101.0000     101




NOTE ON THE CASE OF A TIE BETWEEN TWO STATES 


In applying the rule for the Method of Equal Proportions, the case of a
“tie” between two states can occur only extremely rarely, that is to say,
only when two “multipliers” (used in forming the priority list) happen to be
commensurable numbers. For serial numbers up to k=100, this occurs
only four times, as follows:


   k    1/mult.         k    1/mult.         k    1/mult.         k    1/mult.
  25   10(6)^(1/2)     49   28(3)^(1/2)     50   35(2)^(1/2)     81   36(5)^(1/2)
   3     (6)^(1/2)      4    2(3)^(1/2)      9    6(2)^(1/2)      5    2(5)^(1/2)

The corresponding “ties” for the Method EP are as follows:

State  Pop.  (I) (II)    State  Pop.  (I) (II)    State  Pop.  (I) (II)    State  Pop.  (I) (II)
A      10n    25  24     C      14n    49  48     E      35n    50  49     G      18n    81  80
B       1n     2   3     D       1n     3   4     F       6n     8   9     H       1n     4   5
       11n    27  27            15n    52  52            41n    58  58            19n    85  85

n = 4, 5, 6, ....        n = 5, 6, 7, ....


In each of these four cases, assignment (I) is chosen rather than assignment
(II) merely on account of the convention which provides that in case of a
tie preference shall be given to the state having the larger population.


On the other hand, in applying the rule for the Method of Major Fractions,
the case of a tie may occur much more frequently. Thus, if p, q, and n are
any positive integers, the following assignments (I) and (II) will always be
tied in the Method MF:


State    Pop.            (I)        (II)
J        (2p+1)n         p+1        p
K        (2q+1)n         q          q+1

         2(p+q+1)n       p+q+1      p+q+1
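
That the two assignments are always tied may be verified with exact arithmetic. The sketch below assumes that, in the Method MF, the claim of a state holding x seats to its next seat is its population divided by x + 1/2; the function names are hypothetical.

```python
from fractions import Fraction

def mf_priority(pop, x):
    """Claim of a state with x seats to its next seat: pop / (x + 1/2)."""
    return Fraction(pop) / (x + Fraction(1, 2))

def is_tie(n, p, q):
    """True when State J (population (2p+1)n, holding p seats) and State K
    (population (2q+1)n, holding q seats) have equal claims to the next seat."""
    return mf_priority((2 * p + 1) * n, p) == mf_priority((2 * q + 1) * n, q)

print(all(is_tie(n, p, q)
          for n in range(1, 5) for p in range(1, 8) for q in range(1, 8)))
```

Both claims reduce to 2n, whatever the values of p and q, so the check succeeds for every triple tried.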




We may even have a triple tie, as follows:


                 Method MF

State     Pop.     (I)   (II)   (III)
L        11000      6     5      5
M         7000      3     4      3
N         3000      1     1      2
         21000     10    10     10


While none of these tie cases is likely to occur in actual practice in Congress,
the extreme rarity of the possibility of such a tie in the Method of Equal
Proportions is a theoretical argument in favor of that method.


APPENDIX I


CRITIQUE OF TWO FURTHER CONFLICTING METHODS


A third form in which the exact equality between two states may be
written is

    (rep. over) = (rep. under)(Pop. over)/(Pop. under),

where “Pop. over” and “rep. over” stand for the population and number
of representatives of that one of the two states which is over-represented
in comparison with the other, and “Pop. under” and “rep. under” stand for
the population and number of representatives of the under-represented
state.

The (relative or absolute) difference between the two sides of this
equation may be taken as a third measure of inequality between the two
states, and may be called the (relative or absolute) “representation-surplus”
belonging to the two states. If we use the relative difference, we obtain a
third test, which we may call Test 3 (not written out here in detail), which
leads to the Method of Equal Proportions. If we use the absolute difference,
we have the following less desirable test:


Test 3a (not recommended). If the absolute “representation-surplus”
belonging to any two states, that is, the value of
(rep. over) − (rep. under)[(Pop. over)/(Pop. under)],
can be reduced by a transfer of a representative from one state to the other, then
this transfer should be made.


This Test 3a proves to be a “workable” test, and leads to a distinct
method of apportionment which may be called the Method of Smallest
Divisors (SD). In comparison with the Method of Equal Proportions,
Method SD favors the small states even more than does the Method of the
Harmonic Mean (see Example 9).

A fourth form of the exact equation, namely,

    (rep. over)(Pop. under)/(Pop. over) = (rep. under),

suggests, in a similar way, a Test 4, based on relative differences and leading
to the Method of Equal Proportions, and a less desirable Test 4a, based on
absolute differences, as follows:


Test 4a (not recommended). If the absolute “representation-deficiency”
belonging to any two states, that is, the value of
(rep. over)[(Pop. under)/(Pop. over)] − (rep. under),
can be reduced by a transfer of a representative from one state to the other, then
this transfer should be made.*


This Test 4a proves to be “workable,” and leads to another distinct 
method of apportionment which may be called the Method of Greatest Divisors 
(GD). In comparison with the Method of Equal Proportions, Method GD 
favors the large states even more than does the Method of Major Fractions 
(see Example 9). 


Example 9

                   Assignment of Reps.     Representation-Surplus    Representation-Deficiency
                                              (Tests 3a and 3)           (Tests 4 and 4a)
                                            4 − 5C/B    6 − 3B/C       6A/B − 7    8B/A − 5
                          HM, EP,
State     Pop.      SD      MF      GD         SD          EP             EP          GD
A          726       7       7       8                                   7.000       5.939
B          539       5       6       5        3.108       6.000          8.082       5.000
C          335       4       3       3        4.000       4.827
          1600      16      16      16

Absolute                                      0.892       1.173          1.082       0.939
Relative                                      0.287       0.243          0.155       0.188


The conflict between Tests 3a and 4a, which does not exist between Tests
3 and 4, again confirms our belief that the relative difference is, for the present
problem, a more natural and useful idea than the absolute difference. The
two conflicting Methods SD and GD may be regarded as on the same level
of merit, as between themselves; but both of them are inferior to Methods
HM and MF, and even more inferior to the Method of Equal Proportions.


* Tests 3a and 4a were presented by the present writer at a meeting of the American
Mathematical Society on February 25, 1922.


WORKING RULE FOR THE METHOD OF SMALLEST DIVISORS 


The working rule for Method SD is the same as the rule for Method EP,
except that, in forming the “priority list,” the multipliers there given are
replaced by those in the following table.

        Method SD

    No.    Multipliers
     2        1/1
     3        1/2
     4        1/3
    ...       ...

The proof that this rule satisfies Test 3a is as follows. Suppose, as before,
that A has x+1 and B has y, and that State A is over-represented in
comparison with State B. Then the inequality between States A and B,
measured according to Test 3a, is (x+1) − y(A/B). If, hypothetically, A had
x and B had y+1, then the inequality would be (y+1) − x(B/A). Now from the
construction of the “priority list,” A/x > B/y; hence

    1 + (A + B)y/B > 1 + (A + B)x/A;

whence

    y + 1 + Ay/B > x + 1 + Bx/A,

whence

    (y + 1) − x(B/A) > (x + 1) − y(A/B).


WORKING RULE FOR THE METHOD OF GREATEST DIVISORS 


The working rule for Method GD is the same as the rule for Method EP,
except that the table of “multipliers” is replaced by
the table here given.

        Method GD

    No.    Multipliers
     2        1/2
     3        1/3
     4        1/4
    ...       ...

The proof that this rule satisfies Test 4a is, briefly, as follows: If A has
x+1 and B has y, the inequality, according to Test 4a, is (x+1)B/A − y.
If A had x and B had y+1, the inequality would be (y+1)A/B − x. Now from
the construction of the “priority list,” A/(x+1) > B/(y+1); hence

    (A + B)(y + 1)/B > (A + B)(x + 1)/A,

whence

    (y + 1)A/B + y + 1 > (x + 1)B/A + x + 1,

whence

    (y + 1)A/B − x > (x + 1)B/A − y.*


COMPARISON OF THE FIVE KNOWN METHODS OF APPORTIONMENT 


The only known methods of apportionment which are “workable,” and 
avoid the Alabama Paradox, are the five methods described above, namely 
(in the order in which they favor the smaller states), Methods SD, HM; EP; 
MF, GD. 

Example 10 (which is a combination of Examples 1 and 9) gives a 
comparison of the results of all five of the methods. Examples 11 and 12 
are further examples which likewise separate the five methods. Of course in 
many cases, two or more of the methods will agree in their results. 
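
The five methods may be compared by running one priority-list routine with five different series of multipliers. The following is a sketch under stated assumptions, not the paper's own tabulation: each state first receives one seat, and a state holding x seats claims its next seat with its population divided by x (SD), by the harmonic, geometric, or arithmetic mean of x and x+1 (HM, EP, MF), or by x+1 (GD). The names DIVISORS and apportion are hypothetical, and the populations are those of Example 9.

```python
from math import sqrt

# Divisor attached to a state's claim for its (x+1)-th seat under each method.
DIVISORS = {
    "SD": lambda x: x,
    "HM": lambda x: 2 * x * (x + 1) / (2 * x + 1),
    "EP": lambda x: sqrt(x * (x + 1)),
    "MF": lambda x: x + 0.5,
    "GD": lambda x: x + 1,
}

def apportion(pops, seats, method):
    """One seat per state, then each further seat goes to the state with
    the largest value of pop / divisor(x), x being its current seats."""
    divisor = DIVISORS[method]
    reps = [1] * len(pops)
    while sum(reps) < seats:
        i = max(range(len(pops)), key=lambda j: pops[j] / divisor(reps[j]))
        reps[i] += 1
    return reps

pops = [726, 539, 335]   # States A, B, C of Example 9
for method in DIVISORS:
    print(method, apportion(pops, 16, method))
```

For these populations the sketch gives 7, 5, 4 under SD; 7, 6, 3 under HM, EP, and MF; and 8, 5, 3 under GD. Ties between claims are not treated specially in this sketch.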


Example 10 Example 11 Example 12 


Number of Reps. Number of Reps. Number of Reps. 


SD HM EP MF GD       SD HM EP MF GD       SD HM EP MF GD


The following table gives a summary of the working rules for the five 
methods, arranged in the order in which they favor the small states. 

It will be observed that the Method of Equal Proportions occupies the 
central position among the five methods, having no “bias” in favor of either 
the small or the large states. 


* The working rule for Method GD, except for the provision that every state shall have at 
least one representative, is the same as that devised by the Belgian, Victor d’Hondt, in 1885, as a 
practical method of computation; none of the theoretical tests (1, 2,3,4; 1a, 2a, 3a, 4a) were known at 
that time. On the history of the d’Hondt Method, see C. G. Hoag and G. H. Hallett, Proportional 
Representation, New York, 1926. 



Pop. 

A 729 A A 906119 9 9 9 10 

C 539 6 6 § IC 555};6 5 6 5 6 & 

D 534 433 D 331913 4 3 3 3 

— | 2600 26 26 26 26 26000] 26 26 26 26 26 
3200 


Between any two states, A and B, the assignment

        A    x+1
        B    y

is better than the assignment

        A    x
        B    y+1

provided

    SD:    A/x > B/y
    HM:    A(x+x+1)/[2x(x+1)] > B(y+y+1)/[2y(y+1)]
    EP:    A/[x(x+1)]^(1/2) > B/[y(y+1)]^(1/2)
    MF:    A/(x + 1/2) > B/(y + 1/2)
    GD:    A/(x+1) > B/(y+1)


APPENDIX II


CRITIQUE OF CERTAIN UNWORKABLE TESTS


A fifth form in which the exact equation may be written is the following
(using the notation explained above):

    (rep. over)/(rep. under) = (Pop. over)/(Pop. under),

and the (relative or absolute) difference between these two numbers might
be taken as the measure of inequality between the two states.
If we use the relative difference, the resulting Test 5 (which the reader
may write out for himself) leads at once to the Method of Equal Proportions.
If, on the other hand, we use the absolute difference, we have a Test 5a
which is not merely less desirable but is absolutely “unworkable.”


Test 5a (unworkable). If the value of the difference
(rep. over)/(rep. under) − (Pop. over)/(Pop. under)
belonging to any two states can be reduced by a transfer of representatives from
one state to the other, then this transfer should be made.


In many cases this apparently plausible test fails to give any information
as to which of several proposed apportionments is to be preferred. For example,
if we attempt to apply this test to the apportionment of 16 representatives
to the three states whose populations are given in Example 13, we find that
assignment (1) is better than assignment (2), and that (2) is better than (3),
and also that (3) is better than (1), so that no choice is indicated.
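
The circular preference described here can be exhibited by direct computation. The sketch below assumes the populations 762, 534, 304 and the three assignments 7, 5, 4; 7, 6, 3; 8, 5, 3 of Example 13, and compares two assignments on the pair of states in which they differ; the function names are hypothetical.

```python
def measure(pop_a, rep_a, pop_b, rep_b):
    """Test 5a: (rep. over)/(rep. under) - (Pop. over)/(Pop. under)."""
    # The over-represented state has more representatives per inhabitant.
    if rep_a / pop_a < rep_b / pop_b:
        pop_a, rep_a, pop_b, rep_b = pop_b, rep_b, pop_a, rep_a
    return rep_a / rep_b - pop_a / pop_b

pops = {"A": 762, "B": 534, "C": 304}           # Example 13
assignments = {
    1: {"A": 7, "B": 5, "C": 4},
    2: {"A": 7, "B": 6, "C": 3},
    3: {"A": 8, "B": 5, "C": 3},
}

def better(i, j):
    """Of assignments i and j, return the one with the smaller measure,
    judged on the two states in which the assignments differ."""
    s, t = [k for k in pops if assignments[i][k] != assignments[j][k]]
    mi = measure(pops[s], assignments[i][s], pops[t], assignments[i][t])
    mj = measure(pops[s], assignments[j][s], pops[t], assignments[j][t])
    return i if mi < mj else j

print(better(1, 2), better(2, 3), better(3, 1))
```

The three comparisons return 1, 2, and 3 respectively, so that each assignment is preferred to the next in a closed circle.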

Examples 14, 15, 16, 17, 18 establish in like manner the unworkableness 
of certain other tests, which will be listed below. 

All these results confirm again our belief that the use of absolute dif- 
ferences, instead of the more natural relative differences, in this problem, 
is not well advised. 


Example 13.   Measure: (rep. over)/(rep. under) − (Pop. over)/(Pop. under)

State    Pop.     (1)   (2)   (3)
A         762      7     7     8
B         534      5     6     5
C         304      4     3     3
         1600     16    16    16

Assignment       (1)       (2)       (2)       (3)       (3)       (1)
Measure       4/5−C/B   6/3−B/C   6/7−B/A   8/5−A/B   8/3−A/C   4/7−C/A
First term      0.800     2.000     0.857     1.600     2.667     0.571
Second term     0.569     1.757     0.701     1.427     2.507     0.399
Difference      0.231     0.243     0.156     0.173     0.160     0.172


Example 14.   Measure: (Pop. under)/(Pop. over) − (rep. under)/(rep. over)

State    Pop.     (1)   (2)   (3)
A         698      8     7     7
B         534      5     6     5
C         368      3     3     4
         1600     16    16    16

Assignment       (1)       (2)       (2)       (3)       (3)       (1)
Measure       B/A−5/8   A/B−7/6   C/B−3/6   B/C−5/4   A/C−7/4   C/A−3/8
First term     0.7650    1.3071     0.689     1.451     1.897     0.527
Second term    0.6250    1.1667     0.500     1.250     1.750     0.375
Difference     0.1400    0.1404     0.189     0.201     0.147     0.152


Example 15.   Measure: 1/(rep. under) − [(Pop. over)/(Pop. under)]/(rep. over)

State    Pop.     (1)   (2)   (3)
A        7496      7     7     8
B        5340      5     6     5
C        3164      4     3     3
        16000     16    16    16

Assignment        (1)          (2)          (2)          (3)          (3)          (1)
Measure       1/5−(C/B)/4  1/3−(B/C)/6  1/7−(B/A)/6  1/5−(A/B)/8  1/3−(A/C)/8  1/7−(C/A)/4
First term      .20000       .33333       .1429        .2000        .33333       .14286
Second term     .14813       .28129       .1187        .1755        .29614       .10552
Difference      .05187       .05204       .0242        .0245        .03719       .03734


Example 16.   Measure: [(Pop. under)/(Pop. over)]/(rep. under) − 1/(rep. over)

State    Pop.     (1)   (2)   (3)
A        7139      7     7     8
B        5339      5     6     5
C        3522      4     3     3
        16000     16    16    16

Assignment        (1)          (2)          (2)          (3)          (3)          (1)
Measure       (B/C)/5−1/4  (C/B)/3−1/6  (A/B)/7−1/6  (B/A)/5−1/8  (C/A)/3−1/8  (A/C)/7−1/4
First term      .30318       .21989       .1910        .1496        .1644        .2896
Second term     .25000       .16667       .1667        .1250        .1250        .2500
Difference      .05318       .05322       .0243        .0246        .0394        .0396


Example 17.   Measure: |(rep. larger)/(rep. smaller) − (Pop. larger)/(Pop. smaller)|

State    Pop.     (1)   (2)   (3)
A         737      7     7     8
B         534      5     6     5
C         329      4     3     3
         1600     16    16    16

Assignment       (1)       (2)       (2)       (3)       (3)       (1)
Measure       B/C−5/4   6/3−B/C   A/B−7/6   8/5−A/B   8/3−A/C   A/C−7/4
First term      1.623     2.000     1.380     1.600     2.667     2.240
Second term     1.250     1.623     1.167     1.380     2.240     1.750
Difference      0.373     0.377     0.213     0.220     0.427     0.490


Example 18.   Measure: |(rep. smaller)/(rep. larger) − (Pop. smaller)/(Pop. larger)|

State    Pop.     (1)   (2)   (3)
A         721      8     7     7
B         534      5     6     5
C         345      3     3     4
         1600     16    16    16

Assignment       (1)       (2)       (2)       (3)       (3)       (1)
Measure       B/A−5/8   6/7−B/A   C/B−3/6   4/5−C/B   4/7−C/A   C/A−3/8
First term      .7406     .8571      .646      .800      .571      .479
Second term     .6250     .7406      .500      .646      .479      .375
Difference      .1156     .1165      .146      .154      .092      .104


SUMMARY OF TESTS 1-32 AND TESTS 1a-32a


A systematic examination of all the ways in which the exact equation
may be written suggests a total of 64 measures of inequality between two
states (32 based on the use of relative differences, and 32 based on the use
of absolute differences; see table below).

The 32 tests based on relative differences may be called Tests 1-32, and
all lead to the same Method of Equal Proportions.* The 32 less desirable
tests based on absolute differences may be called Tests 1a-32a, and lead to a
confusion of miscellaneous results, as exhibited in the accompanying table.


UNDESIRABLE MEASURES OF INEQUALITY BETWEEN TWO STATES


In Tests 1a-16a,                               In Tests 17a-32a,
A = Pop. of over-represented state             A = Pop. of larger state
B = Pop. of under-represented state            B = Pop. of smaller state

(1a)   B/b − A/a              HM               (17a)  |B/b − A/a|              HM
(2a)   a/A − b/B              MF               (18a)  |a/A − b/B|              MF
(3a)   a − (A/B)b             SD               (19a)  |a − (A/B)b|             MF
(4a)   (B/A)a − b             GD               (20a)  |(B/A)a − b|             MF
(5a)   a/b − A/B              Ex. 13           (21a)  |a/b − A/B|              Ex. 17
(6a)   (a/b)B − A             SD               (22a)  |(a/b)B − A|             Ex. 17
(7a)   (1/A)(a/b) − 1/B       SD               (23a)  |(1/A)(a/b) − 1/B|       Ex. 17
(8a)   (Ba)/(Ab) − 1          EP               (24a)  |(Ba)/(Ab) − 1|          Ex. 17
(9a)   1 − (Ab)/(Ba)          EP               (25a)  |1 − (Ab)/(Ba)|          Ex. 18
(10a)  B − (b/a)A             GD               (26a)  |B − (b/a)A|             Ex. 18
(11a)  1/A − (1/B)(b/a)       GD               (27a)  |1/A − (1/B)(b/a)|       Ex. 18
(12a)  B/A − b/a              Ex. 14           (28a)  |B/A − b/a|              Ex. 18
(13a)  1/b − (A/B)(1/a)       Ex. 15           (29a)  |1/b − (A/B)(1/a)|       HM
(14a)  (B/A)(1/b) − 1/a       Ex. 16           (30a)  |(B/A)(1/b) − 1/a|       HM
(15a)  Ba − Ab                MF               (31a)  |Ba − Ab|                MF
(16a)  1/(Ab) − 1/(Ba)        HM               (32a)  |1/(Ab) − 1/(Ba)|        HM


Note. (17a)=(1a), (18a)=(2a), (31a)=(15a), (32a)=(16a).


* Tests 1-32 may be read immediately from the Table of Tests 1a-32a by dividing each difference 
by the smaller of its two terms. The resulting relative difference will be equal to (a/b)/(A/B)—1 
or 1—(A/B)/(a/b), according as A or B is the over-represented state. 




In this table, in Tests 1a—16a, A stands for the over-represented state, and 
B for the under-represented state; and in Tests 17a-32a, A stands for the 
larger state, and B for the smaller, while the vertical bars indicate that the 
absolute value of the quantity is to be taken, without regard to sign. Tests 
followed by “Ex. 13,” “Ex. 14,” etc., are “unworkable” tests, the proof of 
this fact being supplied in each case by the example cited. 

It may be noted that measures 17a and 18a are the same as measures 1a
and 2a, respectively, while measures 31a and 32a are the same as 15a and 16a.
Measures 17a, 29a, 30a, 32a differ only by a constant factor; and the same 
is true of measures 18a, 19a, 20a, 31a, and of measures 21a, 22a, 23a, 24a, 
and of measures 25a, 26a, 27a, 28a. 


A FURTHER BASIS FOR THE METHOD OF EQUAL PROPORTIONS 


It may also be noted that Tests 8a and 9a, although based on absolute 
differences, happen to lead to the Method of Equal Proportions. The quantity 
(a/b)/(A/B)—1, which occurs in Test 8a, and the quantity 1—(A/B)/(a/b), 
which occurs in Test 9a, may be called, respectively, the (absolute) ratio- 
surplus and the (absolute) ratio-deficiency belonging to the two states. If 
we use the term ratio-discrepancy to mean, at pleasure, the relative or 
absolute ratio-surplus, or the relative or absolute ratio-deficiency, belonging 
to two states, then the four tests 8, 9, 8a, 9a may be combined into a single 
criterion, as follows: 


Test 33. If the “ratio-discrepancy” belonging to any two states (that is, 
the relative or absolute amount by which (a/b)/(A/B) or (A/B)/(a/b) differs 
from unity) can be reduced by a transfer of a representative from one state to the 
other, then this transfer should be made. 


This Test 33, like all the Tests 1-32, leads directly to the Method of 
Equal Proportions. The original Tests 1 and 2 remain, however, perhaps 
the most satisfactory characterization of the Method. 
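
The transfer criterion of Test 33 can be checked mechanically. The following is a minimal sketch in Python, not taken from the paper. It assumes the familiar priority-list formulation of the Method of Equal Proportions (each state first receives one representative, and each further seat goes to the state with the largest value of P^2/(n(n+1)), where n is its current number of seats). The populations and the function names are hypothetical and serve only as illustration.

```python
from fractions import Fraction

def equal_proportions(pops, seats):
    """Assign seats by the priority-list form of the Method of Equal
    Proportions (assumed formulation; every state starts with one seat)."""
    reps = [1] * len(pops)
    for _ in range(seats - len(pops)):
        # Comparing P^2 / (n(n+1)) is equivalent to comparing P / sqrt(n(n+1))
        # and keeps the arithmetic exact.
        best = max(range(len(pops)),
                   key=lambda i: Fraction(pops[i] ** 2, reps[i] * (reps[i] + 1)))
        reps[best] += 1
    return reps

def ratio_discrepancy(A, a, B, b):
    """Relative amount by which (a/b)/(A/B) or (A/B)/(a/b) differs from unity."""
    x = Fraction(a * B, b * A)
    return max(x, 1 / x) - 1

def improving_transfers(pops, reps):
    """Pairs (i, j) for which moving one seat from state i to state j would
    reduce the ratio-discrepancy between the two states (Test 33)."""
    found = []
    for i in range(len(pops)):
        for j in range(len(pops)):
            if i == j or reps[i] < 2:      # keep at least one seat per state
                continue
            before = ratio_discrepancy(pops[i], reps[i], pops[j], reps[j])
            after = ratio_discrepancy(pops[i], reps[i] - 1, pops[j], reps[j] + 1)
            if after < before:
                found.append((i, j))
    return found

pops = [729, 534, 337]                     # hypothetical populations
reps = equal_proportions(pops, 16)
print(reps, improving_transfers(pops, reps))
```

Under these assumptions the list of improving transfers should come out empty for the computed assignment, whereas a lopsided assignment such as 14, 1, 1 admits at least one transfer that reduces the discrepancy.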


Briefly, the Method of Equal Proportions may be described as the only
method which makes (1) the ratio of population to representatives and (2) the
ratio of representatives to population, as nearly uniform as possible among
the several states.


APPENDIX III


CRITIQUE OF METHODS BASED ON AVERAGE OR TOTAL ERROR 


All the discussion up to this point has been based on the idea of comparison 
between competing states, and all the tests so far considered may be called
“comparison tests.” There is another possible method of approach to the
problem, however, which should here be mentioned. This is based on the 
idea of computing some sort of average or total error for the whole ap- 
portionment, and selecting as the best apportionment that one whose total 
error is the least. 

There are two objections to this method of approach. In the first place, 
it is obvious that a total or average error might be reasonably small, while at 
the same time the error affecting some particular state might be shockingly 
large; and a gross injustice done to a particular state could hardly be 
successfully defended on the ground that “on the average” the other states 
are fairly treated. 

In the second place, when one actually tries to set up a definition for the 
total or average error, the multiplicity of possible formulas makes it extremely 
difficult to select any one as more significant than the rest. If q is the true
quota, and r the actual number of representatives, of the ith state, then the
error attached to that particular state may be defined in at least four different
ways: r-q, r/q-1, 1-q/r, 1/q-1/r; and the total error may then
be defined as the simple sum or as the weighted sum, of either the absolute 
values of the errors, or the squares of the errors; and the weighting factors 
may be chosen in a great variety of ways. Most of the resulting methods 
can be shown to involve the Alabama Paradox; the only ones which do not, 


lead about equally to the Method of Major Fractions and the Method of 
Equal Proportions. 
Thus, Method MF minimizes 


while Method EP minimizes 


As neither of these sets of formulas appears to have any obvious advantages 
over the other, it is difficult to make out a clear case for either the Method 
MF or the Method EP on the basis of the idea of total error.* 
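
To make the multiplicity of definitions concrete, the four per-state errors listed above can be tabulated for any proposed apportionment. The sketch below is illustrative only: the function names and the choice of an unweighted sum are assumptions, not the paper's formulas, and the true quota q is taken to be the state's proportional share of the total number of representatives.

```python
from fractions import Fraction

def state_errors(pops, reps):
    """Return, for each state, the four error measures r-q, r/q-1, 1-q/r, 1/q-1/r."""
    total_pop, total_reps = sum(pops), sum(reps)
    rows = []
    for P, r in zip(pops, reps):
        q = Fraction(P * total_reps, total_pop)   # true quota of this state
        rows.append({
            "r - q": r - q,
            "r/q - 1": r / q - 1,
            "1 - q/r": 1 - q / r,
            "1/q - 1/r": 1 / q - Fraction(1, r),
        })
    return rows

def total_error(rows, key, squared=False):
    """Simple (unweighted) sum of absolute values, or of squares, of one measure."""
    return sum(e[key] ** 2 if squared else abs(e[key]) for e in rows)

rows = state_errors([729, 534, 337], [7, 5, 4])   # hypothetical data
for key in rows[0]:
    print(key, total_error(rows, key), total_error(rows, key, squared=True))
```

Each choice of measure, of absolute value or square, and of weighting gives a different total, which is the difficulty the text describes.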

Finally, it may occur to one to use, as the measure of error of the whole
apportionment, not the sum of the errors of the several states, but the maximum
error with which any state is affected; the best apportionment being
that one which has the smallest maximum error. As far as is known, all attempts
to apply this principle lead to the Alabama Paradox.


* See F. W. Owens, loc. cit., and E. V. Huntington, loc. cit.

We are thrown back, therefore, on the simple comparison tests, the study 
of which reveals the substantial advantages of the Method of Equal
Proportions.


HARVARD UNIVERSITY, 
CAMBRIDGE, Mass. 




CONDITIONS FOR ASSOCIATIVITY OF DIVISION 
ALGEBRAS CONNECTED WITH 
NON-ABELIAN GROUPS* 


BY 
JOHN WILLIAMSON 


1. Introduction. The problem of the determination of division algebras
has been successfully investigated by Professor L. E. Dickson, who was the
first to discover a division algebra $D$ of order $n^2$ over a field $F$, and who recently
has shown† how to construct all algebras $\Gamma$ of order $n^2=Q^2q^2$ over a field $F$,
corresponding to the Galois group $G$ of order $n=Qq$ of an equation $f(x)=0$
irreducible in $F$. In addition, he has determined the general conditions,
called $D_1$, $D_2$, $D_3$, which must be satisfied by the algebra $\Gamma$ if it is associative.
He has also reduced these conditions in detail for an algebra $\Gamma$ corresponding
to a Galois group $G$ of two generators, both when $G$ is an abelian group, and
when $G$ is not abelian but of a special type. Based on his work, our problem
is to reduce the associativity conditions, first when the group $G$ is generated
by two generators $\Theta_1$ and $\Theta_q$, where $\Theta_q$ transforms $\Theta_1$ into some power of $\Theta_1$;
and second when $G$ is generated by three generators $\Theta_1$, $\Theta_p$, and $\Theta_q$, where $\Theta_p$
transforms $\Theta_1$ into some power of $\Theta_1$ and $\Theta_q$ transforms $\Theta_1$ and $\Theta_p$ into powers
of $\Theta_1$ and $\Theta_p$ respectively.

As this paper is a continuation of Dickson’s paper (these Transactions, 
vol. 28 (1926), pp. 207-234), direct reference is made to it throughout. 
Numbered lemmas, numbered theorems and formulas in square brackets 
refer to lemmas, theorems and formulas in his paper. The notation is every- 
where the same except that, for convenience in this paper, Q has been used 
for p, and 6 for 6. It is assumed that the reader has Dickson’s paper before 
him. 

The conditions $D_1$, $D_2$, and $D_3$ are the formulas [53], [55], and [58] of
Theorem 10:


D, 6 = 5(0,)ae, 

Choro = Cher(Oq) Ou (kyr =1,--+,g—1; a = 1), 

(k=1,2,---,qg—1), 


* Presented to the Society, December 31, 1926; received by the editors January 28, 1927. 
† L. E. Dickson, New division algebras, these Transactions, vol. 28 (1926).




where there are Q—1 subscripts 0 under the last a and Q under the final 6. 


Part 1. ALGEBRAS $\Gamma$ CONNECTED WITH A NON-ABELIAN GROUP
GENERATED BY TWO GENERATORS 


2. The group G. Let $G_1$ be the cyclic group generated by $\Theta_1$ of order
$q$, and let $G_1$ be extended to $G$ by $\Theta_q$, where $G_1$ is of index $Q$ under $G$. Then the
$Q$th, but no lower than the $Q$th, power of $\Theta_q$ is a substitution of $G_1$. If
also $\Theta_q$ transforms $\Theta_1$ into some power $x$ of $\Theta_1$, then

$$\Theta_q^{-1}\Theta_1\Theta_q=\Theta_1^{x},\qquad \Theta_q^{Q}=\Theta_1^{e},$$

where $e$ and $x$ are integers less than $q$.

Since $G_1$ is cyclic we may denote $\Theta_1^{k}$ by $\Theta_k$ $(k<q)$ and hence
(1) $\Theta_q^{-s}\Theta_k\Theta_q^{s}=\Theta_1^{kx^{s}}$ for all integers $s>0$.

But $\Theta_e=\Theta_q^{Q}$ and is commutative with $\Theta_q$; hence it follows from (1)
with $k=e$ and $s=1$ that
(2) $e(x-1)\equiv 0 \pmod q$.
For the same reason, replacing $s$ by $Q$ and $k$ by 1 in (1), we see that
(3) $x^{Q}\equiv 1 \pmod q$


and that $x$ is relatively prime to $q$. Groups of this type exist; one such is a
transitive group of order 16 with $Q=2$, $q=8$, $x=5$ and $e=4$.
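
The numerical example can be checked directly. The sketch below is an illustration rather than part of the paper: it verifies conditions (2) and (3) for Q=2, q=8, x=5, e=4, and then builds a multiplication table on pairs (s, k), a hypothetical labelling in which (s, k) stands for the product of the s-th power of the extending generator and the k-th power of the cyclic generator. It confirms that the resulting table is associative, has 16 elements, and is not commutative. It does not test transitivity.

```python
from math import gcd
from itertools import product

Q, q, x, e = 2, 8, 5, 4                  # parameters quoted in the text

cond2 = (e * (x - 1)) % q == 0           # condition (2)
cond3 = pow(x, Q, q) == 1                # condition (3)
coprime = gcd(x, q) == 1

def mul(u, v):
    """Multiply (s, k) by (r, l), where (s, k) denotes Theta_q^s Theta_1^k."""
    (s, k), (r, l) = u, v
    # By relation (1), Theta_1^k Theta_q^r = Theta_q^r Theta_1^(k x^r).
    s2, k2 = s + r, k * pow(x, r, q) + l
    if s2 >= Q:                          # reduce with Theta_q^Q = Theta_1^e
        s2, k2 = s2 - Q, k2 + e
    return (s2, k2 % q)

elements = list(product(range(Q), range(q)))
associative = all(mul(mul(a, b), c) == mul(a, mul(b, c))
                  for a in elements for b in elements for c in elements)
abelian = all(mul(a, b) == mul(b, a) for a in elements for b in elements)

print(cond2, cond3, coprime, len(elements), associative, abelian)
```

Conditions (2) and (3) are exactly what the reduction step in `mul` relies on: (2) makes the e-th power of the cyclic generator commute with the extending generator, and (3) makes conjugation by its Q-th power trivial.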
3. Algebra $\Sigma$. The units $j$ may be given the notation


(4) jt = je, Je = = (k <4; s<Q), 
(5) jit Sie, 
where $g$ and $\delta$ are numbers $\neq 0$ of $F(i)$. We also see that

ko =kx (modq), =kx* (mod (ko<q, ho---.0<Q) 


where there are s zeros as subscript to k. Throughout this part of the paper 
Qz2e Will denote ax,..., where there are s zeros subscript to k. 

The subgroup $G_1$ is now cyclic. Hence by Theorem 1 the algebra $\Sigma$ may
be regarded as an algebra of order $q^2$ over the field $F_1$, derived from $F$ by
adjoining all the symmetric functions of $i, \theta_1(i), \cdots, \theta_{q-1}(i)$. This algebra
is associative if $g=g(\theta_1)$.* Consequently, by Theorem 10, $\Gamma$ is associative,
if the conditions $D_1$, $D_2$ and $D_3$ all hold and $g=g(\theta_1)$.


* Loc. cit., §4. 




4. Associativity conditions for $\Gamma$. Equation [6] gives the following
formulas:


(6) 
je =8ju, Ce =B k<q). 


The condition $D_1$ gives
(7) 5 = 5(0,)a,. 


Let us now consider the condition $D_2$. For any integer $m>0$, there exists
an integer $a_m$, $0<a_m<q$, and an integer $t_m>0$ such that $t_mx=mq+a_m$ and
$(t_m-1)x<mq$. We define $t_0$ to be 1. Hence $a_m$ is the value of $t_{m0}$, which is
written for $(t_m)_0$. If $t_{m+1}>k\ge t_m$, then $k=t_m+s$, $kx=mq+a_m+sx$ and
$k_0=a_m+sx$. In the same way, if $t_{n+1}>r\ge t_n$, $r=t_n+v$ and $r_0=a_n+vx$.
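
The integers $t_m$ and $a_m$ are easy to tabulate, which may help in following the case distinctions below. The sketch is illustrative only. It assumes $1<x<q$ with $x$ prime to $q$, and it restricts $m$ to $1,\cdots,x-1$, which appear to be the only values needed since $k<q$; the function names are hypothetical. The second function checks the stated formula $k_0=a_m+sx$ against a direct reduction of $kx$ modulo $q$.

```python
def t_and_a(q, x):
    """Return {m: (t_m, a_m)} for m = 1, ..., x-1, where t_m is the least
    positive integer with t_m * x > m * q and a_m = t_m * x - m * q."""
    table = {}
    for m in range(1, x):
        t = (m * q) // x + 1
        a = t * x - m * q
        # These hold when gcd(x, q) = 1 and 1 < x < q (assumed).
        assert 0 < a < q and (t - 1) * x < m * q
        table[m] = (t, a)
    return table

def check_k0(q, x):
    """Verify k_0 = a_m + s*x for every k with t_m <= k < t_(m+1), k = t_m + s."""
    table = {0: (1, x)}                  # t_0 = 1 by definition, hence a_0 = x
    table.update(t_and_a(q, x))
    for k in range(1, q):
        m = max(m for m, (t, _) in table.items() if t <= k)
        t, a = table[m]
        if (k * x) % q != a + (k - t) * x:
            return False
    return True

print(t_and_a(8, 5))
print(check_k0(8, 5))
```

For the group of order 16 mentioned in §2 (q=8, x=5) this gives $t_1=2$, $t_2=4$, $t_3=5$, $t_4=7$ with $a_1=2$, $a_2=4$, $a_3=1$, $a_4=3$.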

If k+r<q, by (6) and, if kotro<g, and (k+r)x 
=(m+n)q+rotko. Consequently and D, becomes 


But, if kot+ro=q, and becomes 


If we write k =1, that is m=0 and s =1, in (8) and (9) we get 


(10) = (o+ 1 < — te), 


(11) )g (o + 1 = — 
and (12) follows by induction from (10) and (11): 


(12) a= = ) eee (r = 1,2, eee =x 1). 


It is easily verified that equations (9) and (10) are satisfied identically, 

when the values for a;, a, and a, from (12) are substituted into them. 
When k+r=q, kotro=q and so while Hence D, 

becomes a,a,(6;**)g=g(8,), or on substitution for a, and a, from (12) 


(13) - - = g(A,). 
That g occurs on the left hand side to the power of x is easily seen. For 
(k+r)x = (m+n)qgt+ kotro=(m+n-+ 1)q, 
mtn+1=xz. 




If k+r>q, ci-=g and u=k+r—g. Then, as in the previous cases, 
= 1, k+r=tmint 5, U = tmin-z +O; 
=g, R+tr=tmintit = bmingi-z + 


On substituting for $a_k$, $a_r$ and $a_u$ their values from (12) into $D_2$ and cancelling
the terms common to both sides, we see that, when $k+r>q$, $D_2$ reduces
to (13). Hence we have the following lemma:


LEMMA A. The condition $D_2$ reduces for all values of $k$, $r<q$, to (12) or (13),
where (12) merely serves to express $a_r$ $(r=2,\cdots,q-1)$ in terms of $a_1$.


Next, let us consider the condition D;. Since X°=1 (mod q), jx,....=je 
(where there are Q subscripts 0) and, since j, and j, are commutative, d, in 
D; is equal to 1. Condition D; becomes 

Lemma B. The condition (14) follows for all values of k<q from 


To prove this lemma by induction, we assume that (14) holds for all values 
of k <k and, writing 0,* for 7 in (15), combine the equation thus obtained with 
(14). Since by [8] and (1) 


s=1 


But by the general formula D, this becomes 


s=1 Ck zs, 


(Since OF” =Ox,...9) is used to denote Where there are s sub- 

scripts 0.). Allthec’s in this product cancel except the first of the numerator 

and the last of the denominator, namely Cx,1(82) and ¢;2¢,29, each of which is 

equal to 1, since for the induction k<qg—1. Hence (16) is simply (14) with k 

replaced by k+1. As (14) holds for k=1 the proof of the lemma is complete. 
We have now proved the following theorem: 


THEOREM A. Let $f(x)=0$ be an equation of degree $Qq$ irreducible in $F$
whose Galois group $G$ is generated by $\Theta_1$ and $\Theta_q$, such that $\Theta_1$ is of order $q$ and
$\Theta_q$ transforms $\Theta_1$ into $\Theta_1^{x}$ and $\Theta_q^{Q}=\Theta_1^{e}$, while no lower than the $Q$th power of
$\Theta_q$ is equal to a power of $\Theta_1$. Excluding the case $q=2$, we see that $G$ is not
abelian and that $x$, $e$, $q$ and $Q$ must satisfy (2) and (3). The roots of $f(x)=0$ are


=0,1,---,Q-—1 
6:*(0,"(i)) = 6.7 (0:**"(i)) 1 ) 


where 0,°(i) =i, 0,°(i) =6,"(i), and 6, and @, are rational functions of i with 
coefficients in F. There exists an associative algebra = whose elements are 


A = fot fiji t+ + 


where the $f_k$ are polynomials in $i$ of degree less than $Qq$ with coefficients in $F$,
while


so that the product of any two elements of $\Sigma$ is another element of $\Sigma$. Let


A! = fo(Oq) + ex; 
k=l 
where $a_k$ is defined by (12). Then under multiplication defined by [20] the
totality of polynomials in $j_q$ with coefficients in $\Sigma$ form an algebra of order $Q^2q^2$
over $F$, which is associative if and only if $g=g(\theta_1)$, $\delta=\delta(\theta_q)a_e$, and (13) and (15)
hold.


Part 2. ALGEBRAS $\Gamma$ CONNECTED WITH A GROUP GENERATED
BY THREE GENERATORS 

5. The group G. Let the group $G$ have the invariant subgroup $G_1$,
which is of the same type as the group $G$ considered in §2, where $G_1$ has the
invariant cyclic subgroup $G_2$ generated by $\Theta_1$ of order $p$, and $G_2$ is of index
$P$ under $G_1$ and is extended to $G_1$ by the substitution $\Theta_p$. Further, let $G_1$
be of index $Q$ under $G$ so that the $Q$th, but no lower than the $Q$th, power
of $\Theta_q$ is a substitution of $G_1$. Then, if $\Theta_q$ transforms $\Theta_1$ into $\Theta_1^{y}$ and $\Theta_p$ into
while 0, transforms 9, into we have 
(17) OF (e <p, a <p, < P), 
(18) 
(19) = 
(20) = 0,'", 


where a, b and s are integers >0. 




It follows from §2 that the substitutions of $G_1$ are represented uniquely
in the form $\Theta_{bp+a}=\Theta_p^{b}\Theta_1^{a}$ $(b<P,\ a<p)$ and, if $q=Pp$, the substitutions
of $G$ in the form $\Theta_{rq+k}=\Theta_q^{r}\Theta_k$ $(r<Q,\ k<q)$. As in §2 we see that


(21) $x^{P}\equiv 1 \pmod p$,
(22) $e(x-1)\equiv 0 \pmod p$.

If we write s=Q, a=1 in (19), it follows from (17) that 
(23) = (mod p). 
Similarly, from (17) and (20) with s=Q, we find that 
(24) b(z2@ — 1) = bmP, emb+e(x*—1) =0 (mod p)(b=1,---,P—1). 
But (24) is satisfied if , 
(25) 22—1=mP, em+e(x—1) =0 (mod p)(m integer > 0). 
In addition the transforms of 0° and @50; by ©, must be equal and also 
the transforms of @/ and @, by @,. Hence we have 
(26) e(z—1)=np, es y) =0 (mod integer > 0). 
Finally, since 

= 


= 


and, as x is relatively prime to #, y is relatively prime to p by (23). Hence 


Other conditions to be satisfied by the parameters $e$, $e_1$, $e_2$, $x$, $y$, and $z$ may
be deduced, but these are all that will be required. It is sufficient for our
purpose that groups of this type do exist. For example, there is a transitive
group of order 32 in which $p=4$, $P=4$, $Q=2$, $e=2$, $e_1=2$, $e_2=0$ and
$x=y=z=3$.

If k=a+bp(a=0, cer, p—1;b=0, 1, P-1), then Roo...0=Goo---0 
+boo...op where doo...o<p and =ay* (mod p), boo...o<P and =bz* (mod P) 
and there are s subscripts 0. With these values of k and ko, the units and 
constants of multiplication of I are given by formulas [49], [50] and [52], 
where #, e and @ are replaced by Q, e’ and 6 respectively. 

6. The algebra $\Sigma$. The subgroup $G_1$ being now of the type $G$ considered
in §2, the algebra $\Sigma$, which by Theorem 1 may be regarded as an algebra
of order $q^2$ over the field $F_1$, derived from $F$ by adjoining all the symmetric
functions of $i, \theta_1(i), \cdots, \theta_{q-1}(i)$, is of the type $\Gamma$ considered in Part 1. If
we substitute $p$, $P$, $\beta$ and $\rho$ for $q$, $Q$, $a$ and $\delta$ respectively, all the formulas
of Part 1 hold. Hence $\Sigma$ is associative if, and only if,


= g(0:), 

p = p(6,)B., 

B - - - = g(6,), 

p = - - - 


By Theorem 10, if (28) holds, $\Gamma$ is associative if and only if the conditions
$D_1$, $D_2$, and $D_3$ all hold. In these conditions, as quoted in the introduction,
we must now write $e'$ for $e$.


7. Associativity conditions for $\Gamma$. Condition $D_1$ gives
(29) (eo esp). 
In the consideration of condition Dz, let 


k=bp+a 
r=spt+t s=0,1,---,P-1/. 


- If b=s=0, we see as in §4 that D, reduces to (30) and (31): 


(30) Aa = = ) (a 1,2, 1), 


(31) = - - - gv, 


where yi, ="p+a, and (t,—1)y<mp, while t,4:>a2t,.* 
Now, let a=¢=0 so that k and r are multiples of p and may be taken as 
kp and rp respectively. Hence we must consider the condition 


If ztm = MP +m, 2(tm—1) <mP(m=0,1, - - -,2—1)(@m<P),* 
then k=/,,+s and ks=mP+46, where b=sz+a,, <P. 

Since, by the second of (17), 0*=@30",t we must consider the value 
of em. As at the beginning of §4 we can find integers f, and a, 20, such that 
ef,=upta, and e(f,—1)<p where a,<p. Then, if f,4:>m=f,t+h2f,, 
Hence kpo=bp+a,t+he. Similarly, if r=t,+v, n=f,+w, 


* See the definition of $t_m$ and $a_m$ at the beginning of §4.
† If $e=0$ the work is exactly similar to that in §4.




then rp)>=dp+a,+we, where d=vz+a,<P. We now require to consider the 
value of Since 


JkpoJrpo = 


sdythe -b -ay+we -d 


then 
where ¢=a,+a,+(h+w)e. 
For, since 
a, + we = ne (mod 9), 
(a, + we)x’ = nex® (mod p) 
and so by (22) 
nex = ne = a, + we (mod ?). 


In (33), Cop,ne Genotes Csp,s, where ne=f (mod p) and f<p, and later, to 
simplify the formulas, is often written for if and 
ee; =@,, even when a and ¢ are greater than p, and 6 and s greater than P. 
When b+d<P, and, if m+n is of the form f,,,+¢ and 
Chp,.rp, but, if c=[p, then m+n is of the form f,4,4:+¢ and 
= 

When b+d=P, j,**=pj{j}***, and from (33) we see that a factor g 
or g? Occurs in Ckp,,rp,, according as ¢+e=p or =2p; that is, according as 
m-+n-+1 is of the form f,4,4:+¢ or f,4,42+¢. Hence the complete values of 
Ckp,.rp, aS Obtained from (33) are given by 


(34) Ckpo.rp9 — 


where 
X 


1, if ts, m+n = fur tt, 

=g, iff k+r=tmin tS, Mtn = ft, 

= if R+ = $5, 
= if R+r=tmingi tS, 


= *)g?, if = tminti m+n+1= Sutera t 


Now, since j,j7.=8.jj,, we have 


(35) Cop.ne = Bne Bne(On) 




and by (10) and (11) 


Bre = Be B (81°) (r ¥ fy), 
g 

Be Bor—1) (81°) (r = f,) 


For ex=e (mod p) and accordingly 
Hence, by (17), the second of (28), (35), and (36), 


where G,,=p p(0.) - - - p(02-"), and n=f,+w. 
When k+r<P, Cip,rp=1 and u=(k+r)p, and if we take k=1, D, by 
means of (34) and (37) becomes 


(36) 
Bre 


(38) ( : y 


where 

=1, 

= p(O2*™), mt1F 

= rt1 = fry. 
From successive applications of (38) we get* 
(39) Orp = - - p(O2-")g”, 


where r=1, 2,---, P—1; +0; n=f,+w. 

By means of (34) and the formula = = it can be shown 
that D, is satisfied identically when the values of ax», a-p and a, are substi- 
tuted from (39) into (32), for all values of k and r for which k+r<P. 

But, if k+r=P, Cip,.p=p and u=e. Hence 


(k+r)z= Pz, k+r=t,, 
and, since kz +0 (mod P) (kx P—1),z=m+n+1. If 
(40) z=fith (a, + he < p), 


A=y+v or pt+v+1 or w+v+2, and in all cases by (34) and (39) D, reduces 
tof 


If e=0, A=0, a,=1 and (41) becomes apay(Op") ++ p*=p(0,). 






Similarly, if k+r>P, Dz reduces to (41) for all values of k<P, r<P. 
For, when k+r>P, =p and u=e+(k-+r—P)p. Now 


(hr—P) )C ag, pp = Let 


and by (26) D. becomes 


(42) Ce = (9g) Ces, (k+r—P) ap% , 
and, if 

k+r=itta (a, + az < P), 
then 


kt+r—P=t,+ 4. 
Hence, if s=f,+-n, where a,+ne <>, the left hand side of (42) is equal to 


Then, if 
s—2z=f,t+n' (a, +n’e < p), 


by (40) 


or 2”, 
and so o=A+y or according as = 1 Or g. The right hand 
side of (42) then becomes 


where 
X = - - 


On equating the two sides so obtained and cancelling the common factors, 


we get (41). 
We must now consider the general case of D2, where 
r=t+sp 


For simplicity in writing let 
je’ be defined as j, when Of’ = ©, and a’ > p >a, 


jy’ be defined as j5p;awhen 0,’ = @4,,4and b’ > P>b. 


Then 




Hence 


To get the value of cx,,, we consider* 

which is equal to 
Since j,», may be of the form 777," we have 

or, since x*=x (mod p), 
(45) Ctys,bspJp = Cosp.tyjt™ 
Hence 


= Cbep, ty(O2") Cay 


= 


= 
We get as special cases of Dz, 
(47) Cay, tary Ca,tab(Og) ae, 


and 


(48) = Aa+bp = 


where (48) combined with (30) and (39) defines a, in terms of a and a,, and 
Cay,bsp = 1 Or g according as or =p, where 


ay=mptan (an<p), =sP+b, (<P), se=upts, p). 


* If e=0, Cay, mep=1 for all values of and m 


where 




Making use of (47) and (48), and substituting for c,, and c,,,, their values 
obtained from (44), (45) and (46) in Dz, we get 


(49) Coen, ty = Cty zd, bepCbp, 


The w in the first of (47) may be of the form ¢+s and so the first of (47) 
is a case of D, that we are considering. But by writing a=v, )=0, and 
proceeding as in the general case, we reduce it to (49), where since }=0 
the formula corresponding to the first of (47) is now of the type (48). The 
second and third of (47) have been treated earlier. 

We now prove the following lemma: 


Lemma A. The formula (49) may be deduced for all values of b<P and 
t<p from 


Assume that (49) holds for all values of b<b and ¢Si, and consider (49) 
with ¢=1; that is 
(S1) = Cys, 
If we now write 6{ for z in (51) and multiply the left members of (51) and 


(49) together and equate the result to the product of the right members, we 
get 


where 


Now, 


yOP*)Coep, y dep Cy ) 


= b 
= Chep,ty Cop, C(t+l) yx bap, 
and 


am 
Cop. t+1 = Coad, 


Making use of these two results, we see that (52) becomes (49) with # re- 
placed by ¢+1, and so by induction (49) may be deduced from (51). 
Now, (49) with =x becomes 


(53) Crap, zy = Cysdit bap Cop,2(%)T, 


where 
T= 




C(b+1) pt = 


Chzep,zp C(b+1) sp.y yz, ep bps 


b+1 
= cy 2+1,(b+1) ep» 


when we combine (53) with (50), where oy is written for z in (50), we get 
(49) with b replaced by 6+1 and our lemma is proved. Since z<P, ¢y2..p=1 
and (50) becomes 


where 


We have now shown that the condition $D_2$ reduces for all values of
$k<q$, $r<q$ to (30), (31), (39), (41), (48), and (54) where (30), (39), and (48)
merely express $a_k$ $(k<q)$ in terms of $a_1$ and $a_p$.

It remains to consider the condition D3. If jeji=dijijer, where jx =Je,--+4 
and there are Q subscripts 0, k’=a’+b’p, where a’=ay°=ax*: (mod p) by 
(26), and b’=bz°=bmP +5 by (24), and accordingly 


je = 
Also ¢.%=dcxe and D; becomes 
We shall now prove the following lemma: 
Lemma B. Condition D; follows for all values of k<q from (56) and (57): 
(S6) Cor 10 = C2, - - 


Since (55) holds for all values of k <q, it is true in particular for the two 
cases b=0 and a=0 respectively: 


(58) Ce’,a0 = Cat, aay (0,2-*) , 


If we write 


gov? = per 


Since 
and | 




for i in (59), since 


= 


we have from (58) and (59) . 
(60) Co aCe’ = Caner, 002%) X, 


where 


= Ca bp(9,2) Gay 


Now, since meb+e,(x*—1)=0 (mod by (24), 
* (f Oand in F(i)). 


Hence, 


= 


(019) Cazes, e’Ca 


Ce’ Cera 


From this result remembering that ax#=ay® (mod and that 
we see that (60) becomes (55). By induction, in a manner similar to that 
used in Lemma B of §4, it can be shown that (58) and (59) are consequences 
of (56) and (57) respectively. In the proof we require the formulas 


Co at 1Cazts, 20, = Coral Cazes, (at1) 
Cop. (b+1) p, ) 
= Ce’ 0" » 
which can be deduced as in the previous cases. Since 
Cort = Cogp,1(91%) Coy, 200 ANG Coy, 20s = = Cz, 05 
(56) becomes 
But e2*P—1 by (26) and so c,.,=1 and (57) becomes 




In (61) 


and in (62), since 2? =mP+1, 
= 


Cme,e,2=1 or g, according as or >e, and e,x=¢ (mod p). 
We have now proved 


THEOREM B. Let $f(x)=0$ be an equation of degree $n=QPp$, irreducible
in a field $F$, whose group for $F$ is generated by three generators $\Theta_1$, $\Theta_p$ and $\Theta_q$
described in §5. Then the algebra $\Sigma$ is associative if and only if conditions
(28) hold. The totality of polynomials in $j_q$ with coefficients in $\Sigma$ form an algebra
$\Gamma$ of order $n^2$ over $F$ which is associative if and only if conditions (29), (31),
(41), (54), (61), and (62) all hold and $\Sigma$ is associative.


UNIVERSITY OF CHICAGO, 
CHICAGO, ILL.




A GENERALIZATION OF TAYLOR’S SERIES* 


BY 
D. V. WIDDER†


1. Introduction. In view of the great importance of Taylor’s series in 
analysis, it may be regarded as extremely surprising that so few attempts 
at generalization have been made. The problem of the representation of an 
arbitrary function by means of linear combinations of prescribed functions 
has received no small amount of attention. It is well known that one phase 
of this problem leads directly to Taylor’s series, the prescribed functions in 
this case being polynomials. It is the purpose of the present paper to discuss 
this same phase of the problem when the prescribed functions are of a more 
general nature. 

Denote the prescribed functions by


(1) $u_0(x),\ u_1(x),\ u_2(x),\ \cdots,$


real functions of the real variable $x$, all defined in a common interval $a<x<b$.
Set


$$s_n(x)=c_0u_0(x)+c_1u_1(x)+\cdots+c_nu_n(x).$$


It is required to determine the constants $c_i$ in such a way that $s_n(x)$ shall
be the best approximation to a given function $f(x)$ that can be obtained by
a linear combination of $u_0, u_1, \cdots, u_n$. Of course this problem becomes
definite only after a precise definition of the phrase “best approximation”
has been given. Various methods have been used, of which we mention the
following:

(A) The method of least squares; 

(B) The method of Tchebycheff; 

(C) The method of Taylor. 

In each of these cases the functions (1) may be so restricted that the
constants $c_i$ are uniquely determined. The function $s_n(x)$ thereby determined
is called a function of approximation. Having determined the functions of
approximation, one is led directly to an expansion problem. Under what
conditions will $s_n(x)$ approach $f(x)$ as $n$ becomes infinite? Or, when will the
series


* Presented to the Society, December 29, 1926; received by the editors in January, 1927. 
† National Research Fellow in Mathematics.




(2) $s_0(x)+[s_1(x)-s_0(x)]+[s_2(x)-s_1(x)]+\cdots$


converge and represent $f(x)$ in $(a, b)$ or in any part of $(a, b)$?
There are two special sequences (1) that have received particular
attention:


(1′) $1,\ x,\ x^2,\ \cdots,$
(1″) $1,\ \sin x,\ \cos x,\ \sin 2x,\ \cos 2x,\ \cdots.$


The following scheme will serve as a partial reference list to this field, and 
to put into evidence the gap in the general theory which it is hoped the present 
paper will in some measure fill. 

          (A)*              (B)*                 (C)
(1′)      A. M. Legendre    P. L. Tchebycheff    B. Taylor
(1″)      J. J. Fourier     M. Fréchet           G. Teixeira†
(1)       E. Schmidt        A. Haar

The entry in the upper left-hand corner, for example, means that the 
series (2) becomes for the method (A) and for the special sequence (1’) 
the expansion of f(x) in a series of Legendre polynomials. It should be 
pointed out that the series studied by Teixeira were considered by him in 
another connection, and that no mention of their relation to Taylor’s series 
was made. 

It is found that if certain restrictions are imposed on the sequence (1),
and if the functions of approximation are determined according to the
method (C), then the general term of the series (2) may be factored, just as
in Taylor’s series, into two parts $c_ng_n(x)$, the second of which depends in no
way on the function $f(x)$ represented, the constant $c_n$ alone being altered
when $f(x)$ is altered. As in the case of Taylor’s series the constant $c_n$ is determined
by means of a linear differential operator of order $n$. If further
restrictions, Conditions A of §6, are imposed on the sequence (1), it is found
that series (2) possesses many of the formal properties of a power series.
If $t$ is a point at which $s_n(x)$ has closest contact with $f(x)$, then the interval
of convergence of (2) extends equal distances on either side of $t$ (provided
that the interval of definition $(a, b)$ permits). The familiar process of analytic
extension also applies to this generalized power series.

A necessary and sufficient condition for the representation of a function
$f(x)$ is obtained by generalizing a theorem of S. Bernstein. Then imposing
further conditions, Conditions B of §10, it is found possible to represent an
arbitrary analytic function in a series (2). It is shown that the conditions are
not so strong as to exclude the case of Taylor’s series, and that sequences (1)
exist, satisfying the conditions, and leading to series quite different from
Taylor’s series. Finally the relation of the general series to Teixeira’s series
is shown.


* For references, see Encyklopädie der Mathematischen Wissenschaften, IIC9c (Fréchet-Rosenthal), §51.

† Extrait d’une lettre de M. Gomes Teixeira à M. Hermite, Bulletin des Sciences Mathématiques
et Astronomiques, vol. 25 (1890), p. 200.

2. The Taylor method of approximation and the existence of the functions
of approximation. The Taylor method of approximation consists in
determining the constants $c_i$ of $s_n(x)$ in such a way that the approximation
to $f(x)$ shall be as close as possible in the immediate neighborhood of a point
$t$ of $(a, b)$, irrespective of the magnitude of the error $|f(x)-s_n(x)|$ at
points $x$ remote from $t$. More precisely, the constants $c_i$ are determined so
that the curves $y=f(x)$ and $y=s_n(x)$ shall have closest contact at a point $t$.
If the functions $f(x)$ and $s_n(x)$ are of class $C^{n+1}$ (possess continuous derivatives
of order $n+1$) in the neighborhood of $x=t$, then the curves $y=f(x)$ and
$y=s_n(x)$ (or the functions $f(x)$ and $s_n(x)$ themselves) are said to have contact
of order $n$ at $x=t$ if and only if

$$f^{(k)}(t)=s_n^{(k)}(t)\quad(k=0,1,\cdots,n),\qquad f^{(n+1)}(t)\neq s_n^{(n+1)}(t).$$


We now make the following 


DEFINITION. The function

$$s_n(x)=\sum_{i=0}^{n}c_iu_i(x)$$

is a function of approximation of order $n$ for the point $x=t$ if the functions
$u_i(x)$ are of class $C^n$ in the neighborhood of $x=t$, and if $s_n(x)$ has contact of
order $n$ at least with $f(x)$ at $x=t$.


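We shall have frequent occasion to use Wronskians, so that it will be
convenient to introduce a notation. The functions $v_0(x), v_1(x), \cdots, v_n(x)$
being of class $C^n$, we set

$$W[v_0(x),v_1(x),\cdots,v_n(x)]=\begin{vmatrix} v_0(x) & v_1(x) & \cdots & v_n(x)\\ v_0'(x) & v_1'(x) & \cdots & v_n'(x)\\ \vdots & \vdots & & \vdots\\ v_0^{(n)}(x) & v_1^{(n)}(x) & \cdots & v_n^{(n)}(x)\end{vmatrix}.$$

In particular, for the functions of the sequence (1) we set

$$W_n(x)=W[u_0(x),u_1(x),\cdots,u_n(x)].$$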




We may now state 


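THEOREM I. If the functions $f(x), u_0(x), u_1(x), \cdots, u_n(x)$ are of class
$C^n$ in the neighborhood of $x=t$, and if $W_n(t)\neq 0$, then there exists a unique
function of approximation

$$s_n(x)=-\frac{1}{W_n(t)}\begin{vmatrix} 0 & u_0(x) & u_1(x) & \cdots & u_n(x)\\ f(t) & u_0(t) & u_1(t) & \cdots & u_n(t)\\ f'(t) & u_0'(t) & u_1'(t) & \cdots & u_n'(t)\\ \vdots & \vdots & \vdots & & \vdots\\ f^{(n)}(t) & u_0^{(n)}(t) & u_1^{(n)}(t) & \cdots & u_n^{(n)}(t)\end{vmatrix}$$

of order $n$ for $x=t$.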


The proof of this theorem consists in noting that the determinant of the
system of equations

$$f^{(k)}(t)=c_0u_0^{(k)}(t)+c_1u_1^{(k)}(t)+\cdots+c_nu_n^{(k)}(t)\qquad(k=0,1,\cdots,n)$$

is $W_n(t)$, which is different from zero by hypothesis, and in solving the system
for the constants $c_i$. The values of the $c_i$ thus obtained give the above expression
for $s_n(x)$.
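
The proof is constructive, and the linear system can be solved numerically in small cases. The following Python sketch is an illustration, not part of the paper. It takes the hypothetical sequence $u_i(x)=e^{ix}$ $(i=0,1,2)$, the point $t=0$, and the function $f(x)=e^{3x}$, all chosen here only because the derivatives at 0 are the integers $i^k$ and $3^k$. The solver and its name are likewise assumptions.

```python
from fractions import Fraction

def solve_linear(matrix, rhs):
    """Solve matrix * c = rhs exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(b)]
           for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular system: the Wronskian W_n(t) vanishes")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

n = 2
# Row k holds u_0^(k)(0), ..., u_n^(k)(0) for u_i(x) = exp(i*x), that is i**k.
u_derivs = [[i ** k for i in range(n + 1)] for k in range(n + 1)]
# f(x) = exp(3x), so f^(k)(0) = 3**k.
f_derivs = [3 ** k for k in range(n + 1)]

coeffs = solve_linear(u_derivs, f_derivs)
print(coeffs)
```

Under these assumptions the coefficients are 1, -3, 3, so that $s_2(x)=1-3e^{x}+3e^{2x}$ and $f(x)-s_2(x)=(e^{x}-1)^3$, which vanishes to the third order at $x=0$, in agreement with contact of order 2.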

3. Determination of the form of the series. In order to form the series
(2) we need to know the existence of the functions of approximation of all
orders. We shall assume then that $f(x)$ and $u_i(x)$, $i=0,1,2,\cdots$, are of
class $C^\infty$ in the interval $a<x<b$. Moreover we shall assume* that $W_i(x)>0$
in the same interval. This insures the existence of the functions of approximation
of all orders for an arbitrary point of the interval. We are thus led
naturally to a set of functions (1) possessing what G. Pólya† has called the
Property W.


DEFINITION. The sequence (1) is said to possess the Property W in $(a, b)$
if each function of the sequence is of class $C^\infty$ in $a<x<b$, and if $W_i(x)>0$,
$i=0,1,2,\cdots$ in the same interval.


We shall now be able to show that the series (2) has the form

$$c_0h_0(x)+c_1h_1(x)+c_2h_2(x)+\cdots,$$


* No gain in generality would be obtained by allowing some or all of the functions W;(x) to 
be negative. 

† G. Pólya, On the mean-value theorem corresponding to a given linear homogeneous differential
equation, these Transactions, vol. 24 (1922), p. 312. We have extended the definition to apply to an 
infinite set. 




where the functions $h_n(x)$ depend only on the sequence (1) and on the choice
of the point $t$, and not at all on the function $f(x)$ to be expanded. The
constants $c_n$, on the other hand, are independent of $x$, but depend on the
function $f(x)$ and on the choice of the point $t$. It is this property of the series
(2) that makes all the series under the method (A) of the introduction so
convenient to use. The property is lacking for the method (B), and for this
reason the Tchebycheff series are less useful in spite of their theoretical
advantages.

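The direct factorization of $[s_n(x) - s_{n-1}(x)]$ is attended with algebraic difficulties which may be avoided by means of the following device. Set
$$\phi(x) = s_n(x) - s_{n-1}(x).$$
Then by Theorem I
$$s_n^{(k)}(t) = f^{(k)}(t) = s_{n-1}^{(k)}(t), \qquad k = 0, 1, \dots, n-1,$$
$$\phi^{(k)}(t) = 0, \quad k = 0, 1, \dots, n-1, \qquad \phi^{(n)}(t) = f^{(n)}(t) - s_{n-1}^{(n)}(t).$$
But $\phi(x)$ by its form is a linear combination of $u_0(x), u_1(x), \dots, u_n(x)$,
$$\phi(x) = a_0 u_0(x) + a_1 u_1(x) + \cdots + a_n u_n(x).$$
Hence the constants $a_i$ must satisfy the equations
$$0 = a_0 u_0^{(k)}(t) + a_1 u_1^{(k)}(t) + \cdots + a_n u_n^{(k)}(t) \qquad (k = 0, 1, \dots, n-1),$$
$$f^{(n)}(t) - s_{n-1}^{(n)}(t) = a_0 u_0^{(n)}(t) + a_1 u_1^{(n)}(t) + \cdots + a_n u_n^{(n)}(t).$$
From these equations we see that $\phi(x)$ must satisfy the equation
$$\phi(x) W_n(t) = \left[f^{(n)}(t) - s_{n-1}^{(n)}(t)\right]
\begin{vmatrix}
u_0(t) & u_1(t) & \cdots & u_n(t) \\
u_0'(t) & u_1'(t) & \cdots & u_n'(t) \\
\vdots & \vdots & & \vdots \\
u_0^{(n-1)}(t) & u_1^{(n-1)}(t) & \cdots & u_n^{(n-1)}(t) \\
u_0(x) & u_1(x) & \cdots & u_n(x)
\end{vmatrix}.$$
The factorization of $\phi(x)$ which we set out to perform is thus completed. For brevity we set
$$g_n(x,t) = \frac{1}{W_n(t)}
\begin{vmatrix}
u_0(t) & u_1(t) & \cdots & u_n(t) \\
u_0'(t) & u_1'(t) & \cdots & u_n'(t) \\
\vdots & \vdots & & \vdots \\
u_0^{(n-1)}(t) & u_1^{(n-1)}(t) & \cdots & u_n^{(n-1)}(t) \\
u_0(x) & u_1(x) & \cdots & u_n(x)
\end{vmatrix}, \tag{3}$$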




so that $g_n(x,t)$ is the function $h_n(x)$ sought. For convenience in later work we have put into evidence the point $t$ chosen. We see that
$$s_n(x) - s_{n-1}(x) = \left[f^{(n)}(t) - s_{n-1}^{(n)}(t)\right] g_n(x,t). \tag{4}$$

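Now by reference to the explicit form of $s_{n-1}(x)$ given in Theorem I it becomes clear that
$$f^{(n)}(t) - s_{n-1}^{(n)}(t) = f^{(n)}(t) + \frac{1}{W_{n-1}(t)}
\begin{vmatrix}
0 & u_0^{(n)}(t) & \cdots & u_{n-1}^{(n)}(t) \\
f(t) & u_0(t) & \cdots & u_{n-1}(t) \\
\vdots & \vdots & & \vdots \\
f^{(n-1)}(t) & u_0^{(n-1)}(t) & \cdots & u_{n-1}^{(n-1)}(t)
\end{vmatrix}
= \frac{W[u_0(t), u_1(t), \dots, u_{n-1}(t), f(t)]}{W_{n-1}(t)}.$$
It will now be convenient to introduce a linear differential operator defined by the relation
$$L_n f(x) = \frac{W[u_0(x), u_1(x), \dots, u_{n-1}(x), f(x)]}{W_{n-1}(x)}.$$
By use of this notation equation (4) becomes
$$s_n(x) - s_{n-1}(x) = L_n f(t)\, g_n(x,t),$$
and the expansion of the function $f(x)$ has the form
$$f(x) \sim L_0 f(t)\, g_0(x,t) + L_1 f(t)\, g_1(x,t) + L_2 f(t)\, g_2(x,t) + \cdots, \qquad L_0 f(x) = f(x). \tag{5}$$

Incidentally, we have proved the following formula:
$$L_n f(t) = f^{(n)}(t) - L_0 f(t)\, g_0^{(n)}(t,t) - L_1 f(t)\, g_1^{(n)}(t,t) - \cdots - L_{n-1} f(t)\, g_{n-1}^{(n)}(t,t).$$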


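4. The properties of the functions $g_n(x,t)$ and of the operators $L_n$. From the equation (3) defining the function $g_n(x,t)$ we read off at once certain properties. Considered as a function of $x$, it is evidently a linear combination of $u_0(x), u_1(x), \dots, u_n(x)$ satisfying the equations
$$\frac{\partial^k}{\partial x^k} g_n(x,t)\Big|_{x=t} = 0 \quad (k = 0, 1, \dots, n-1), \qquad \frac{\partial^n}{\partial x^n} g_n(x,t)\Big|_{x=t} = 1. \tag{6}$$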


The operator $L_n$ is seen to be a linear differential operator of order $n$ which annuls the first $n$ functions of the set (1), and which satisfies the relation

$$L_n u_n(x) = \frac{W_n(x)}{W_{n-1}(x)}.$$
The expanded form of $L_n f$ is
$$L_n f(x) = p_0(x) f(x) + p_1(x) f'(x) + \cdots + f^{(n)}(x),$$
the coefficient of $f^{(n)}(x)$ being unity.
The function $g_n(x,t)$ is the function of Cauchy* used in obtaining a particular solution of the non-homogeneous equation
$$L_{n+1} f(x) = p(x)$$
from the solutions of the corresponding homogeneous equation. The particular solution of this equation vanishing with its first $n$ derivatives at $x = t$ is known to be
$$f(x) = \int_t^x g_n(x,s)\, p(s)\, ds.$$


When $L_n$ operates on the functions $g_m(x,t)$ the result is particularly simple. Since $L_n$ annuls the first $n$ functions of the sequence (1), it follows that
$$L_n g_m(x,t) = 0, \qquad m < n.$$
Let us also compute $L_n g_m(x,t)$ for $x = t$ and $m \ge n$. By means of the relations (6) we find that
$$L_n g_n(x,t)\Big|_{x=t} = \left[\frac{\partial^n}{\partial x^n} g_n(x,t) + \cdots\right]_{x=t} = 1,$$
$$L_n g_{n+p}(x,t)\Big|_{x=t} = \left[\frac{\partial^n}{\partial x^n} g_{n+p}(x,t) + \cdots\right]_{x=t} = 0 \qquad (p = 1, 2, \dots).$$
These properties† may be summed up as follows:
$$L_m g_n(x,t)\Big|_{x=t} = \begin{cases} 0, & m \neq n, \\ 1, & m = n. \end{cases} \tag{7}$$

* E. Goursat, Cours d'Analyse Mathématique, vol. 2, p. 430.

† An a priori discussion of the series in question might be made by starting with these formulas. They may evidently be used to determine the coefficients of the series formally.


The relation of the series (5) to Taylor's series is brought out more clearly if the sequence (1) is replaced by the sequence (1') in the preceding work. Simple computations show that for this case
$$W_n(x) = n!\,(n-1)!\,(n-2)! \cdots 2!\,1!,$$
$$L_n f(x) = f^{(n)}(x),$$
$$g_n(x,t) = \frac{(x-t)^n}{n!}.$$
The series (5) now has precisely the form of Taylor's series.
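These "simple computations" can be checked mechanically. The sketch below is only an illustration, assuming that the sequence (1') is $1, x, x^2, \dots$ and that $g_n(x,t)$ is the quotient of the determinant with rows $u_i^{(k)}(t)$, $k = 0, \dots, n-1$, and last row $u_i(x)$, by $W_n(t)$. The function names are hypothetical; exact rational arithmetic avoids rounding questions.

```python
from fractions import Fraction
from math import factorial

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap changes the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return result

def d_monomial(i, k, x):
    """k-th derivative of x**i at the point x."""
    if k > i:
        return Fraction(0)
    return Fraction(factorial(i), factorial(i - k)) * Fraction(x) ** (i - k)

def wronskian(n, x):
    """W_n(x) for the sequence 1, x, ..., x**n."""
    return det([[d_monomial(i, k, x) for i in range(n + 1)] for k in range(n + 1)])

def g(n, x, t):
    """g_n(x, t) from the determinant quotient, for the sequence 1, x, x**2, ..."""
    rows = [[d_monomial(i, k, t) for i in range(n + 1)] for k in range(n)]
    rows.append([Fraction(x) ** i for i in range(n + 1)])
    return det(rows) / wronskian(n, t)

print(wronskian(3, Fraction(5, 7)))   # 3! * 2! * 1!
print(g(3, 2, Fraction(1, 2)))        # (2 - 1/2)**3 / 3!
```

The Wronskian is independent of the point chosen, and the determinant quotient collapses to $(x-t)^n/n!$, in agreement with the formulas just stated.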

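For many purposes it will be convenient to use another form of the differential operator $L_n$. It is known* that if the Property W holds for the set $u_0(x), u_1(x), \dots, u_{n-1}(x)$ in $(a, b)$, then $L_n f(x)$ may be written as
$$L_n f(x) = \phi_0(x)\phi_1(x)\cdots\phi_{n-1}(x)\, \frac{d}{dx}\, \frac{1}{\phi_{n-1}(x)}\, \frac{d}{dx}\, \frac{1}{\phi_{n-2}(x)} \cdots \frac{d}{dx}\, \frac{1}{\phi_1(x)}\, \frac{d}{dx}\, \frac{f(x)}{\phi_0(x)}, \tag{8}$$
where
$$\phi_0(x) = W_0(x), \qquad \phi_1(x) = \frac{W_1(x)}{[W_0(x)]^2}, \qquad \phi_k(x) = \frac{W_k(x)\, W_{k-2}(x)}{[W_{k-1}(x)]^2} \quad (k = 2, 3, \dots).$$
The functions $\phi_i(x)$ will all be positive for $a < x < b$ since we are assuming that the Property W holds in that interval. The differential expression adjoint to $L_n f(x)$ may then be written†
$$M_n f(t) = \frac{1}{\phi_0(t)}\, \frac{d}{dt}\, \frac{1}{\phi_1(t)}\, \frac{d}{dt} \cdots \frac{1}{\phi_{n-1}(t)}\, \frac{d}{dt}\, \phi_0(t)\phi_1(t)\cdots\phi_{n-1}(t)\, f(t). \tag{9}$$
In formulas (8) and (9) the operation of differentiation applies to all that follows.‡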


* For a simple proof of this fact see G. Pólya, loc. cit., p. 316.

† L. Schlesinger, Lineare Differential-Gleichungen, vol. 1, p. 58.

‡ Throughout this paper the independent variable for the operator $L_n$ is $x$; for $M_n$, $t$. The expression $L_n f(t)$ means $L_n f(x)|_{x=t}$.




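The functions $g_n(x,t)$ can be expressed in terms of the functions $\phi_n(x)$. For $g_n(x,t)$, considered as a function of $x$, satisfies the differential system
$$L_{n+1} u(x) = 0,$$
$$L_m u(x)\Big|_{x=t} = \begin{cases} 0, & m = 0, 1, \dots, n-1, \\ 1, & m = n. \end{cases}$$
The system has a unique solution since the boundary conditions are equivalent to
$$u^{(m)}(t) = \begin{cases} 0, & m = 0, 1, \dots, n-1, \\ 1, & m = n. \end{cases}$$
But by virtue of formula (8) the solution takes the form
$$u(x) = g_n(x,t) = \frac{\phi_0(x)}{\phi_0(t)\phi_1(t)\cdots\phi_n(t)} \int_t^x \phi_1(x_1) \int_t^{x_1} \phi_2(x_2) \cdots \int_t^{x_{n-1}} \phi_n(x_n)\, dx_n \cdots dx_2\, dx_1, \tag{10}$$
a formula which we shall also write as follows:
$$g_n(x,t) = \frac{\phi_0(x)}{\phi_0(t)\phi_1(t)\cdots\phi_n(t)} \int_t^x \phi_1(x) \int_t^x \phi_2(x) \cdots \int_t^x \phi_n(x)\, (dx)^n.$$
That this function satisfies the differential equation is obvious. That it satisfies the boundary conditions may be seen by forming the functions
$$L_k g_n(x,t) = \frac{\phi_0(x)\phi_1(x)\cdots\phi_k(x)}{\phi_0(t)\phi_1(t)\cdots\phi_n(t)} \int_t^x \phi_{k+1}(x) \int_t^x \phi_{k+2}(x) \cdots \int_t^x \phi_n(x)\, (dx)^{n-k}$$
and substituting $x = t$.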
It is a familiar fact, and one that may be directly verified by use of formulas (6) and (9), that $g_n(x,t)$ considered as a function of $t$ satisfies the adjoint differential system
$$M_{n+1} v(t) = 0, \tag{11}$$
$$M_m v(t)\Big|_{t=x} = \begin{cases} 0, & m = 0, 1, 2, \dots, n-1, \\ (-1)^n, & m = n. \end{cases} \tag{12}$$


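But an argument similar to that given above shows that the solution of this system has the form
$$v(t) = g_n(x,t) = \frac{\phi_0(x)}{\phi_0(t)\phi_1(t)\cdots\phi_n(t)} \int_t^x \phi_n(t_n) \int_{t_n}^x \phi_{n-1}(t_{n-1}) \cdots \int_{t_2}^x \phi_1(t_1)\, dt_1 \cdots dt_{n-1}\, dt_n. \tag{13}$$
This formula has the advantage over (10) that it enables one to express $g_n(x,t)$ in terms of $g_{n-1}(x,t)$:
$$g_n(x,t) = \frac{1}{\phi_0(t)\phi_1(t)\cdots\phi_n(t)} \int_t^x \phi_0(s)\phi_1(s)\cdots\phi_n(s)\, g_{n-1}(x,s)\, ds.$$
By use of this formula the functions $g_n(x,t)$ may be computed step by step from the functions $\phi_n(x)$, the computations involving only one new integration for each new function $g_n(x,t)$.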
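The step-by-step character of this computation can be sketched numerically. The code below is an illustration only: it assumes the nested-integral representation $g_n(x,t) = \frac{\phi_0(x)}{\phi_0(t)\cdots\phi_n(t)} \int_t^x \phi_1 \int_t^{x_1} \phi_2 \cdots \int_t^{x_{n-1}} \phi_n$, uses the trapezoid rule on a uniform grid (a choice made here, not one taken from the paper), and the name `g_n` and the parameter `steps` are hypothetical.

```python
def g_n(phis, n, x, t, steps=2000):
    """Approximate g_n(x, t) with one cumulative trapezoid integration per index.

    phis is a list of callables phi_0, ..., phi_n, all assumed positive.
    """
    h = (x - t) / steps
    grid = [t + j * h for j in range(steps + 1)]
    inner = [1.0] * (steps + 1)          # innermost integrand starts as 1
    for k in range(n, 0, -1):            # integrate phi_n first, phi_1 last
        integrand = [phis[k](s) * v for s, v in zip(grid, inner)]
        cumulative = [0.0]
        for j in range(1, steps + 1):
            cumulative.append(cumulative[-1] + 0.5 * h * (integrand[j - 1] + integrand[j]))
        inner = cumulative
    scale = phis[0](x)
    for k in range(n + 1):
        scale /= phis[k](t)              # divide by phi_0(t) * ... * phi_n(t)
    return scale * inner[-1]

# phi_k = 1 for every k should give (x - t)**n / n!.
ones = [lambda s: 1.0] * 4
print(g_n(ones, 3, 1.0, 0.0))

# Multiplying each phi_k by a non-zero constant should leave g_n unchanged.
scaled = [lambda s, c=c: c for c in (1.0, 1.0, 2.0, 3.0)]
print(g_n(scaled, 3, 1.0, 0.0))
```

Both calls approximate $1/6 = (1-0)^3/3!$. The work grows linearly with $n$, since each new index costs one further pass over the grid.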

It should be pointed out that for many purposes it is convenient to consider the functions $\phi_n(x)$ as the given functions instead of the $u_n(x)$. For if the $\phi_i(x)$ are given positive functions in $(a, b)$, then a set of functions $u_i(x)$ possessing the Property W in that interval is
$$u_i(x) = g_i(x,t) \qquad (i = 0, 1, 2, \dots).$$
Evidently any function $\phi_i(x)$ may be multiplied by an arbitrary constant not zero without affecting the form of the series; for a glance at formulas (8) and (10) will show that neither the operators $L_n$ nor the functions $g_n(x,t)$ will be thereby affected. For the special sequence (1') we have
$$\phi_k(x) = k \quad (k = 1, 2, 3, \dots), \qquad \phi_0(x) = 1.$$
However, one is led equally well to Taylor's series by taking
$$\phi_k(x) = 1 \qquad (k = 0, 1, 2, \dots).$$


5. Remainder formulas. Let us begin by deriving an exact remainder formula, the analogue of a well known formula for Taylor's series.* Set
$$R_n(x) = f(x) - L_0 f(t)\, g_0(x,t) - L_1 f(t)\, g_1(x,t) - \cdots - L_n f(t)\, g_n(x,t).$$
By Theorem I this function has a zero of order $(n+1)$ at least at $x = t$. Furthermore it satisfies the differential equation
$$L_{n+1} R_n(x) = L_{n+1} f(x).$$


* See for example E. Goursat, loc. cit., vol. 1, p. 209. 




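But it is known that the only solution of this equation vanishing with its first $n$ derivatives at $x = t$ is
$$R_n(x) = \int_t^x g_n(x,s)\, L_{n+1} f(s)\, ds. \tag{14}$$
This gives the remainder formula desired:
$$f(x) = L_0 f(t)\, g_0(x,t) + L_1 f(t)\, g_1(x,t) + \cdots + L_n f(t)\, g_n(x,t) + \int_t^x g_n(x,s)\, L_{n+1} f(s)\, ds. \tag{15}$$
For the special sequence (1') this becomes
$$f(x) = f(t) + f'(t)(x-t) + \cdots + f^{(n)}(t)\frac{(x-t)^n}{n!} + \int_t^x \frac{(x-s)^n}{n!}\, f^{(n+1)}(s)\, ds.$$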
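For the special sequence $1, x, x^2, \dots$ the exact remainder formula is the classical integral form of Taylor's remainder, which is easy to test numerically. The following sketch is only a check under stated assumptions: it takes $f = \exp$ as a sample function, and uses a composite Simpson rule whose panel count is an arbitrary choice; `simpson` and `taylor_with_remainder` are hypothetical helper names.

```python
import math

def simpson(func, a, b, panels=200):
    """Composite Simpson rule; panels must be even."""
    h = (b - a) / panels
    total = func(a) + func(b)
    for j in range(1, panels):
        total += (4 if j % 2 else 2) * func(a + j * h)
    return total * h / 3

def taylor_with_remainder(n, x, t):
    """Return (partial sum, integral remainder) for f = exp expanded about t."""
    # Every derivative of exp at t equals exp(t).
    partial = sum(math.exp(t) * (x - t) ** k / math.factorial(k) for k in range(n + 1))
    remainder = simpson(lambda s: (x - s) ** n / math.factorial(n) * math.exp(s), t, x)
    return partial, remainder

partial, remainder = taylor_with_remainder(3, 1.0, 0.0)
print(partial + remainder, math.e)
```

The partial sum and the integral together reproduce $e$ to within the quadrature error, which is what an exact (rather than mean-value) remainder formula should give.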

In the previous section we assumed that the functions $f(x)$ and $u_i(x)$ were of class $C^\infty$. For the validity of the remainder formula (15) it is clearly sufficient to assume that $f(x), u_0(x), u_1(x), \dots, u_n(x)$ are of class $C^{n+1}$ and that the Wronskians $W_0(x), W_1(x), \dots, W_n(x)$ are positive in $(a, b)$.

Let us now obtain remainder formulas analogous to certain other of the classical remainder formulas for Taylor's series. Let $F(s)$ be a function of class $C'$ in the interval $(a, b)$, and such that $F'(s)$ is not zero in the interval $(t, x)$ except perhaps at the point $t$. Then formula (14) may evidently be written as
$$R_n(x) = \int_t^x \frac{g_n(x,s)\, L_{n+1} f(s)}{F'(s)}\, F'(s)\, ds.$$
We may now apply the first mean-value theorem for integrals,* and obtain
$$R_n(x) = [F(x) - F(t)]\, \frac{g_n(x,\xi)\, L_{n+1} f(\xi)}{F'(\xi)} \qquad (t < \xi < x;\ x < \xi < t). \tag{16}$$
This is the analogue of the remainder given by Schlömilch,†
$$R_n(x) = \frac{F(x) - F(t)}{F'(\xi)}\, \frac{(x-\xi)^n}{n!}\, f^{(n+1)}(\xi),$$
to which it reduces for the special sequence (1').


* E. Goursat, loc. cit., vol. 1, p. 181. It is to be noted that $[g_n(x,s)\, L_{n+1} f(s)]/F'(s)$ may be discontinuous at $s = t$. The ordinary treatments of the theorem do not admit this possibility, but it may be shown that the theorem is still applicable to this case; cf. G. D. Birkhoff, these Transactions, vol. 7 (1906), p. 115.

† For references see Encyklopädie der Mathematischen Wissenschaften, II A2 (Pringsheim), § 11.




By specializing the function $F(s)$ a variety of remainders may be obtained. Let us take
$$F(s) = \int_t^s g_m(x,\sigma)\, d\sigma, \qquad m \le n.$$
Then $F(s)$ obviously possesses the continuity properties imposed above. That $F'(s) = g_m(x,s)$ is a function of one sign in the open interval $(t, x)$ may be seen by direct inspection of formula (10) or by the general theory of G. Pólya.* For, by formulas (11) and (12) we see that $g_m(x,s)$ considered as a function of $s$ has a zero of order $m$ at the point $x$ and satisfies the differential equation
$$M_{m+1} v(s) = 0.$$
But no solution of this equation not identically zero can vanish more than $m$ times in any interval in which the Property W holds. Consequently $g_m(x,s)$ is different from zero in $(a, b)$ except at $x$. With this special choice of $F(s)$, (16) becomes
$$R_n(x) = \frac{g_n(x,\xi)}{g_m(x,\xi)}\, L_{n+1} f(\xi) \int_t^x g_m(x,s)\, ds. \tag{17}$$
This is the analogue of a remainder of Roche,†
$$R_n(x) = \frac{(x-\xi)^{n-m}(x-t)^{m+1}}{n!\,(m+1)}\, f^{(n+1)}(\xi),$$
to which it reduces for the sequence (1'). By taking $m = n$, (17) becomes
$$R_n(x) = L_{n+1} f(\xi) \int_t^x g_n(x,s)\, ds,$$
and this is the analogue of the familiar Lagrange† remainder. Finally by taking $m = 0$ we obtain
$$R_n(x) = \frac{g_n(x,\xi)}{g_0(x,\xi)}\, L_{n+1} f(\xi) \int_t^x g_0(x,s)\, ds$$
as the analogue of Cauchy's† remainder,
$$R_n(x) = \frac{(x-\xi)^n (x-t)}{n!}\, f^{(n+1)}(\xi).$$


* G. Pólya, loc. cit., p. 317.

† Encyklopädie, II A2, loc. cit.




A simpler remainder which also reduces to that of Cauchy for the sequence (1') is
$$R_n(x) = g_n(x,\xi)\, L_{n+1} f(\xi)\, (x - t).$$
Let us sum up the results in


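THEOREM II. Let the functions $f(x), u_0(x), u_1(x), \dots, u_n(x)$ be of class $C^{n+1}$, and let the Wronskians $W_0(x), W_1(x), \dots, W_n(x)$ be positive in the interval $a \le x \le b$. Then if $t$ is a point of this interval,
$$f(x) = L_0 f(t)\, g_0(x,t) + L_1 f(t)\, g_1(x,t) + \cdots + L_n f(t)\, g_n(x,t) + R_n(x), \tag{18}$$
where
$$g_k(x,t) = \frac{1}{W_k(t)}
\begin{vmatrix}
u_0(t) & u_1(t) & \cdots & u_k(t) \\
u_0'(t) & u_1'(t) & \cdots & u_k'(t) \\
\vdots & \vdots & & \vdots \\
u_0^{(k-1)}(t) & u_1^{(k-1)}(t) & \cdots & u_k^{(k-1)}(t) \\
u_0(x) & u_1(x) & \cdots & u_k(x)
\end{vmatrix},$$
$$L_k f(x) = \frac{W[u_0(x), \dots, u_{k-1}(x), f(x)]}{W_{k-1}(x)} \qquad (k = 1, 2, \dots, n+1;\ L_0 f(x) = f(x)),$$
and where $R_n(x)$ has one of the forms
$$\int_t^x g_n(x,s)\, L_{n+1} f(s)\, ds,$$
$$\frac{g_n(x,\xi)}{g_m(x,\xi)}\, L_{n+1} f(\xi) \int_t^x g_m(x,s)\, ds \qquad (m \le n;\ t < \xi < x;\ x < \xi < t).$$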
The function
$$h_m(x,t) = \int_t^x g_m(x,s)\, ds$$
that appears in the remainder may be expressed in a different form, which will be useful in what is to follow. From the form of the function it is seen to satisfy the following differential system when considered as a function of $x$:
$$L_{m+1} u(x) = 1,$$
$$u^{(k)}(t) = 0 \qquad (k = 0, 1, 2, \dots, m).$$
But the unique solution of this system may also be written in the form
$$h_m(x,t) = \phi_0(x) \int_t^x \phi_1(x) \int_t^x \phi_2(x) \cdots \int_t^x \phi_m(x) \int_t^x \frac{(dx)^{m+1}}{\phi_0(x)\phi_1(x)\cdots\phi_m(x)}. \tag{19}$$
For the special sequence (1') this is equal to $(x-t)^{m+1}/(m+1)!$.

6. Generalized power series. If in formula (18) $n$ is allowed to become infinite, a series of the form
$$a_0 g_0(x,t) + a_1 g_1(x,t) + \cdots \tag{20}$$
results. Before discussing the behavior of the remainder as $n$ becomes infinite, we discuss the general properties of a series of this type, a series which evidently reduces to a power series for the sequence (1'). In particular if $t = 0$ is a point of $(a, b)$, we shall set
$$g_n(x) = g_n(x, 0) \qquad (n = 0, 1, 2, \dots).$$
As has already been observed, no change is made in the series if any function $\phi_i(x)$ is multiplied by a non-vanishing constant. Consequently, no essential restriction will be introduced by the assumption, which will be made in the remainder of this paper, that $\phi_n(0) = 1$. With this assumption we may write
$$g_n(x) = \phi_0(x) \int_0^x \phi_1(x) \int_0^x \phi_2(x) \cdots \int_0^x \phi_n(x)\, (dx)^n. \tag{21}$$


In order that the series (20) may retain many of the formal properties of a power series we introduce

CONDITIONS A: (a) The functions $\phi_i(x)$ are of class $C^\infty$ in the interval $a \le x \le b$;
(b) $\phi_i(x) > 0$ $(a \le x \le b)$;
(c) $\displaystyle \lim_{n \to \infty} \frac{M_n}{m_n} = 1$,
where
$$M_n = \text{maximum } \phi_n(x), \qquad m_n = \text{minimum } \phi_n(x) \quad \text{in } a \le x \le b.$$
In the case of the sequence (1'), $\phi_i(x)$ is constant, and the Conditions A are surely satisfied. It is a simple matter to construct other sequences of functions satisfying the conditions. For example, take


oa(x) = 
Then 


My = €*!", my = 




and the conditions are evidently satisfied in any interval (a, b) however large. 
We are now in a position to prove 


THEOREM III. If the functions $\phi_i(x)$ satisfy the Conditions A in $(a, b)$, and if the series
$$\sum_{n=0}^{\infty} c_n g_n(x,t), \qquad a \le t \le b, \tag{22}$$
converges for a value $x = x_0 \neq t$ of that interval, then it converges absolutely in the interval $|x - t| < |x_0 - t|$, $a < x < b$, and uniformly in any closed interval included therein. If the sum of the series is denoted by $f(x)$, then
$$L_k f(x) = \sum_{n=0}^{\infty} c_n L_k g_n(x,t) \qquad (k = 0, 1, 2, \dots;\ |x - t| < |x_0 - t|). \tag{23}$$
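Since the series (22) converges for $x = x_0$, it follows that there exists a constant $M$ independent of $n$ for which
$$|c_n g_n(x_0,t)| < M.$$
We are thus led immediately to a dominant series for (22),
$$\sum_{n=0}^{\infty} c_n g_n(x,t) \ll M \sum_{n=0}^{\infty} \left| \frac{g_n(x,t)}{g_n(x_0,t)} \right|.$$
We now obtain a more convenient form for $g_n(x,t)$ by successive applications of the mean-value theorem for integrals:
$$g_n(x,t) = \frac{\phi_0(x)\phi_1(\xi_1)\cdots\phi_n(\xi_n)}{\phi_0(t)\phi_1(t)\cdots\phi_n(t)}\, \frac{(x-t)^n}{n!}, \qquad \begin{array}{l} t < \xi_i < x, \\ x < \xi_i < t. \end{array}$$
Here the first line of inequalities holds if $t < x$; the second if $t > x$. Now making use of the upper and lower bounds $M_n$ and $m_n$ of $\phi_n$ in $(a, b)$, we see that
$$|g_n(x,t)| \le \frac{M_0 M_1 \cdots M_n}{m_0 m_1 \cdots m_n}\, \frac{|x-t|^n}{n!},$$
$$|g_n(x_0,t)| \ge \frac{m_0 m_1 \cdots m_n}{M_0 M_1 \cdots M_n}\, \frac{|x_0-t|^n}{n!},$$
$$\sum_{n=0}^{\infty} c_n g_n(x,t) \ll M \sum_{n=0}^{\infty} \left( \frac{M_0 M_1 \cdots M_n}{m_0 m_1 \cdots m_n} \right)^2 \frac{|x-t|^n}{|x_0-t|^n}.$$
The test ratio
$$\left( \frac{M_n}{m_n} \right)^2 \frac{|x-t|}{|x_0-t|}$$
of the dominant series has the limit $|x-t|/|x_0-t|$ as $n$ becomes infinite by Condition A (c). The first part of the theorem is thus established. It remains to show that the operation term by term by $L_k$ is permissible. Now
$$L_k g_n(x,t) = \frac{\phi_0(x)\cdots\phi_k(x)\, \phi_{k+1}(\xi_{k+1})\cdots\phi_n(\xi_n)}{\phi_0(t)\phi_1(t)\cdots\phi_n(t)}\, \frac{(x-t)^{n-k}}{(n-k)!}. \tag{24}$$
Hence
$$\sum_{n=k}^{\infty} c_n L_k g_n(x,t) \ll M \sum_{n=k}^{\infty} \left( \frac{M_0 M_1 \cdots M_n}{m_0 m_1 \cdots m_n} \right)^2 \frac{n!\, |x-t|^{n-k}}{|x_0-t|^n\, (n-k)!}.$$
Consequently the series (23) is uniformly convergent for $|x - t| \le r$, $a < x < b$, where $r < |x_0 - t|$. This is sufficient to establish the result stated.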

As a result of this theorem it follows that there exists an interval of convergence for the series extending equal distances on either side of $t$ (provided the length of the interval of definition $(a, b)$ permits). In particular, the interval may reduce to a single point, or it may be the entire interval $(a, b)$ (which in turn may, in special cases, be the entire $x$-axis). The following
examples will show that all of these cases are possible. Take ¢,(x) =e-*!*. 
Then 


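$$\sum_{n=0}^{\infty} (n!)^2 g_n(x) \ \text{diverges except for } x = 0;$$
$$\sum_{n=0}^{\infty} n!\, g_n(x) \ \text{converges for } |x| < 1, \ \text{diverges for } |x| > 1;$$
$$\sum_{n=0}^{\infty} g_n(x) \ \text{converges for all } x.$$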




Theorem III has a further important consequence. If in equations (23) we set $x = t$, we see that
$$c_k = L_k f(t).$$
Since the coefficients $c_n$ are uniquely determined by the values of $f(x)$ and its derivatives at $x = t$, it follows that the development of a function $f(x)$ in a series (22) is unique.

7. A generalization of Abel's theorem. If a series (22) has an interval of convergence $(-r, r)$,* then by Theorem III it has a continuous sum in the interval $-r < x < r$. As in the case of power series the series may or may not converge at the extremities of the interval. We shall show that if (22) converges at $r$ (or $-r$), then the sum of the series is continuous in the interval $-r < x \le r$ (or $-r \le x < r$) by use of the following

LEMMA. If the functions $\phi_n(x)$ satisfy the conditions A (a), (b), then the determinant
$$\Delta = \begin{vmatrix} g_{n-1}(x) & g_n(x) \\ g_{n-1}(y) & g_n(y) \end{vmatrix}$$
is positive or negative according as $0 < x < y$ or $0 > x > y$.


First it will be shown that $\Delta \neq 0$. If $\Delta$ were equal to zero for two values $x_0$ and $y_0$ distinct from each other and from the origin, it would be possible to determine constants $c_1$ and $c_2$ not both zero such that the function
$$\phi(x) = c_1 g_{n-1}(x) + c_2 g_n(x) \tag{25}$$
would vanish at $x_0$ and $y_0$. But $g_{n-1}(x)$ and $g_n(x)$ both vanish $(n-1)$ times at the origin so that $\phi(x)$ would have at least $(n+1)$ zeros in $(-a, a)$. This however is impossible. For, according to the general results of Pólya already cited, no linear combination of $g_0(x), g_1(x), \dots, g_n(x)$ not identically zero can vanish $(n+1)$ times in an interval in which the Property W holds. Hence $\Delta \neq 0$.

It remains to discuss the sign of $\Delta$. Regard $y$ as fixed, so that $\Delta$ becomes a function of $x$ alone. Evidently
$$\lim_{x \to y} \frac{\Delta}{y - x} = \begin{vmatrix} g_{n-1}(y) & g_n(y) \\ g_{n-1}'(y) & g_n'(y) \end{vmatrix} = W(y).$$
We shall show presently that $W(y) > 0$ for all values of $y$ different from zero in $(-a, a)$. This will be sufficient to establish the Lemma.


* Throughout this section we assume that $a < -r < r < b$; $t$ is taken equal to zero for simplicity.
For, if $x$ is allowed to approach a positive value of $y$ through values less than $y$, then $\Delta/(y-x)$ remains a function of one sign (with the same sign as $\Delta$), and approaches a positive value. The variable $\Delta$ must therefore have been positive. By allowing $x$ to approach a negative $y$ through values between $y$ and zero, we see that
$$\Delta < 0, \qquad y < x < 0.$$


To prove that W(y)>0 throughout (—a, a) except at the origin, first 
note that 
W®(0) = 0 
(n—1) (n—1) 
§n-1 (0) gn (0) 
W*-2)(0) = (n) (n) (n) = 1 
&n—1(0) gn (0) &n—1(0) 
by virtue of relations (6). Hence 


s2n—1 


(2n — 1)! 


This shows that $W(y) > 0$ for values of $y$ sufficiently near the origin. But the same argument used above to show that $\Delta$ is different from zero may be used to show that $W(y)$ is different from zero away from the origin. The Lemma is thus completely established.

By use of this Lemma it is possible to prove 


THEOREM IV. Let the functions $\phi_n(x)$ satisfy Conditions A in $(-a, a)$, and let the interval of convergence of the series
$$\sum_{n=0}^{\infty} c_n g_n(x)$$
be $(-r, r)$. Then if the series converges for $x = r$ (or $x = -r$), its sum is continuous in the interval $-r < x \le r$ (or $-r \le x < r$).


Since the series converges for $x = r$, then to an arbitrary positive $\epsilon$ there corresponds a number $m$ such that
$$|c_{m+1} g_{m+1}(r) + c_{m+2} g_{m+2}(r) + \cdots + c_{m+p} g_{m+p}(r)| < \epsilon \qquad (p = 1, 2, 3, \dots).$$
Now by the Lemma the set of values
$$\frac{g_0(x)}{g_0(r)}, \quad \frac{g_1(x)}{g_1(r)}, \quad \frac{g_2(x)}{g_2(r)}, \quad \cdots$$
forms a decreasing set. Hence by Abel's lemma*
$$|c_{m+1} g_{m+1}(x) + c_{m+2} g_{m+2}(x) + \cdots + c_{m+p} g_{m+p}(x)| < \epsilon\, \frac{g_0(x)}{g_0(r)} \le \epsilon M,$$
where $M$ is the maximum of $g_0(x)/g_0(r)$ in $0 \le x \le r$. Consequently the series converges uniformly in $0 \le x \le r$, and represents a continuous function there. That the sum is continuous in $-r < x \le 0$ follows from Theorem III. A similar proof shows that if the series converges at $-r$, then the sum is continuous in $-r \le x < r$.

* E. Goursat, Cours d'Analyse Mathématique, vol. 1, p. 182.

8. Generalization of the process of analytic continuation. Let us first obtain formulas analogous to the binomial formulas
$$(x - t)^n = x^n - n x^{n-1} t + \frac{n(n-1)}{2!} x^{n-2} t^2 - \cdots + (-1)^n t^n, \tag{26}$$
$$x^n = (x-t)^n + n t (x-t)^{n-1} + \frac{n(n-1)}{2!} t^2 (x-t)^{n-2} + \cdots + t^n. \tag{27}$$
Let $t$ and $u$ be two distinct points of the interval $(a, b)$. Since $g_n(x,t)$ and $g_n(x,u)$ are both linear combinations of $u_0(x), u_1(x), \dots, u_n(x)$,
$$g_n(x,u) = c_0 g_0(x,t) + c_1 g_1(x,t) + \cdots + c_n g_n(x,t).$$
The constants $c_i$ may be determined by use of formulas (7). The result is
$$g_n(x,u) = L_0 g_n(t,u)\, g_0(x,t) + L_1 g_n(t,u)\, g_1(x,t) + \cdots + L_n g_n(t,u)\, g_n(x,t). \tag{28}$$
For the particular sequence (1') this reduces to (26) with $t = 0$, $u = t$, and to (27) with $u = 0$.
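The reduction of the addition formula to the binomial theorem can be confirmed with exact arithmetic. The sketch below assumes the power-sequence case, in which $g_n(x,t) = (x-t)^n/n!$ and $L_k$ is the $k$-th derivative, so that $L_k g_n(t,u) = g_{n-k}(t,u)$; the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def g(n, x, t):
    """g_n(x, t) = (x - t)**n / n! for the sequence 1, x, x**2, ..."""
    return (Fraction(x) - Fraction(t)) ** n / factorial(n)

def L_g(k, n, t, u):
    """L_k g_n(t, u): k-th derivative of g_n(., u) at t (zero when k > n)."""
    return g(n - k, t, u) if k <= n else Fraction(0)

def addition_formula_rhs(n, x, t, u):
    """Sum over k = 0..n of L_k g_n(t, u) * g_k(x, t)."""
    return sum(L_g(k, n, t, u) * g(k, x, t) for k in range(n + 1))

x, t, u = Fraction(3, 2), Fraction(1, 3), Fraction(-2, 5)
print(addition_formula_rhs(5, x, t, u) == g(5, x, u))
```

Multiplying both sides by $n!$ turns the identity into $(x-u)^n = \sum_k \binom{n}{k} (t-u)^{n-k} (x-t)^k$, the ordinary binomial expansion of $[(x-t) + (t-u)]^n$.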

In order to generalize the process of analytic continuation we begin with a consideration of the double series
$$\begin{aligned}
& c_0 L_0 g_0(t,u)\, g_0(x,t) \\
&+ c_1 L_0 g_1(t,u)\, g_0(x,t) + c_1 L_1 g_1(t,u)\, g_1(x,t) \\
&+ c_2 L_0 g_2(t,u)\, g_0(x,t) + c_2 L_1 g_2(t,u)\, g_1(x,t) + c_2 L_2 g_2(t,u)\, g_2(x,t) \\
&+ \cdots.
\end{aligned} \tag{29}$$
Let us suppose that the interval of convergence of the series
$$\sum_{n=0}^{\infty} c_n g_n(x,u) \tag{30}$$


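is $|x - u| < r$, $a < x < b$. It will now be possible to show the double series (29) absolutely convergent in a certain interval. The general term of that series is
$$c_n L_k g_n(t,u)\, g_k(x,t), \qquad 0 \le k \le n.$$
Assuming Conditions A, we may obtain an upper bound for this term as follows:
$$|g_k(x,t)| \le \frac{M_0 M_1 \cdots M_k}{m_0 m_1 \cdots m_k}\, \frac{|x-t|^k}{k!},$$
$$|L_k g_n(t,u)| \le \frac{M_0 M_1 \cdots M_n}{m_0 m_1 \cdots m_n}\, \frac{|t-u|^{n-k}}{(n-k)!}.$$
Let $x_0$ be a point in the interval of convergence of the series (30). Then there exists a constant $M$ independent of $n$ for which
$$|c_n g_n(x_0,u)| < M.$$
Hence we have
$$|g_n(x_0,u)| \ge \frac{m_0 m_1 \cdots m_n}{M_0 M_1 \cdots M_n}\, \frac{|x_0-u|^n}{n!},$$
$$|c_n| < M\, \frac{M_0 \cdots M_n}{m_0 \cdots m_n}\, \frac{n!}{|x_0-u|^n}.$$
Consequently, observing that
$$\frac{M_0 M_1 \cdots M_k}{m_0 m_1 \cdots m_k} \le \frac{M_0 M_1 \cdots M_n}{m_0 m_1 \cdots m_n},$$
we have
$$|c_n L_k g_n(t,u)\, g_k(x,t)| < M \left( \frac{M_0 \cdots M_n}{m_0 \cdots m_n} \right)^3 \frac{n!}{k!\,(n-k)!}\, \frac{|x-t|^k\, |t-u|^{n-k}}{|x_0-u|^n}.$$
We are thus led to a dominant double series, which will now be shown convergent under certain conditions. First form the sum of the $n$th row of this series:
$$M \left( \frac{M_0 \cdots M_n}{m_0 \cdots m_n} \right)^3 \sum_{k=0}^{n} \frac{n!}{k!\,(n-k)!}\, \frac{|x-t|^k\, |t-u|^{n-k}}{|x_0-u|^n} = M \left( \frac{M_0 \cdots M_n}{m_0 \cdots m_n} \right)^3 \left[ \frac{|x-t| + |t-u|}{|x_0-u|} \right]^n.$$
Then form the sum of the row values
$$M \sum_{n=0}^{\infty} \left( \frac{M_0 \cdots M_n}{m_0 \cdots m_n} \right)^3 \left[ \frac{|x-t| + |t-u|}{|x_0-u|} \right]^n.$$
The test ratio of this series is
$$\left( \frac{M_n}{m_n} \right)^3 \frac{|x-t| + |t-u|}{|x_0-u|},$$
and by Condition A (c) this has the limit
$$\frac{|x-t| + |t-u|}{|x_0-u|}$$
as $n$ becomes infinite. Consequently the series (29) is absolutely convergent if
$$|t-u| + |x-t| < |x_0-u|.$$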
The sum of the series may be obtained by summing by rows or by columns. In the one case, using formula (28), we find the sum to be the convergent series
$$\sum_{n=0}^{\infty} c_n g_n(x,u). \tag{31}$$
In the other case, the sum is found to be
$$\sum_{n=0}^{\infty} L_n f(t)\, g_n(x,t), \tag{32}$$
where $f(x)$ is defined as the sum of the series (31). That
$$L_k f(t) = \sum_{n=0}^{\infty} c_n L_k g_n(t,u)$$
follows from Theorem III. We thus have two representations for $f(x)$, the first of which, (31), holds in $|x - u| < r$, $a < x < b$, and the second of which, (32), holds in $|x - t| < r - |t - u|$, $a < x < b$. Conceivably, series (32) may converge in a larger interval, in which case an extension or prolongation of $f(x)$ would be at hand. We sum up the results in


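THEOREM V. If the functions $\phi_i(x)$ satisfy Conditions A in $(a, b)$, and if
$$f(x) = \sum_{n=0}^{\infty} c_n g_n(x,u), \qquad |x - u| < r,$$
then
$$f(x) = \sum_{n=0}^{\infty} L_n f(t)\, g_n(x,t)$$
for all $x$ and $t$ satisfying the relation
$$|x - t| < r - |t - u|, \qquad a \le x \le b, \quad a \le t \le b.$$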


9. A generalization of a theorem of S. Bernstein. We shall now obtain a necessary and sufficient condition for the representation of a function $f(x)$ in a series of the type in question. The method consists in generalizing a familiar theorem of S. Bernstein.* The results to be proved are stated in

THEOREM VI. Let the functions $\phi_i$ satisfy Conditions A in $(a, b)$. Then a necessary and sufficient condition that a function $f(x)$, defined in the interval $a \le x < b$, can be represented by a series
$$f(x) = \sum_{n=0}^{\infty} L_n f(a)\, g_n(x,a) \tag{33}$$
is that $f(x)$ be the difference of two functions of class $C^\infty$ in $a \le x < b$,
$$f(x) = \phi(x) - \psi(x),$$
such that
$$L_n \phi(x) > 0 \ \text{ or } \ \phi(x) \equiv 0; \qquad L_n \psi(x) > 0 \ \text{ or } \ \psi(x) \equiv 0, \qquad a \le x < b \quad (n = 0, 1, 2, \dots).$$
We begin by proving the necessity of the condition. We suppose that
$$f(x) = \sum_{n=0}^{\infty} L_n f(a)\, g_n(x,a), \qquad a \le x < b.$$
By Theorem III this series is absolutely convergent in $a \le x < b$, and hence we may set
$$\phi(x) = \sum_{n=0}^{\infty} |L_n f(a)|\, g_n(x,a),$$
$$\psi(x) = \sum_{n=0}^{\infty} \{ |L_n f(a)| - L_n f(a) \}\, g_n(x,a),$$
$$f(x) = \phi(x) - \psi(x).$$
Again using the results of Theorem III, we have
$$L_k \phi(x) = \sum_{n=0}^{\infty} |L_n f(a)|\, L_k g_n(x,a) \qquad (k = 0, 1, 2, \dots),$$
$$L_k \psi(x) = \sum_{n=0}^{\infty} \{ |L_n f(a)| - L_n f(a) \}\, L_k g_n(x,a).$$

* S. Bernstein, Sur la définition et les propriétés des fonctions analytiques d'une variable réelle, Mathematische Annalen, vol. 75 (1914), p. 449.


By reference to (24) it is seen that every non-vanishing term of each of these series is positive throughout the interval $a < x < b$. The necessity of the condition is thus established.

Conversely, suppose that $f(x) = \phi(x) - \psi(x)$, where $\phi(x)$ and $\psi(x)$ satisfy the conditions of the theorem. It will be enough to show that $\phi(x)$ can be represented in a series (33), for a similar proof will apply to $\psi(x)$; and, since the operators $L_n$ are linear, we will then obtain a representation of the form desired for $f(x)$ by subtracting the series for $\phi(x)$ and $\psi(x)$.

We suppose that $\phi(x)$ is not identically zero, for otherwise the result is obvious. Choose a point $x_0$ of the interval $a < x < b$, and consider the following exact remainder formula:
$$\phi(x_0) = L_0 \phi(t)\, g_0(x_0,t) + L_1 \phi(t)\, g_1(x_0,t) + \cdots + L_n \phi(t)\, g_n(x_0,t) + \int_t^{x_0} g_n(x_0,s)\, L_{n+1} \phi(s)\, ds,$$
where $a < t < x_0$. Since the functions $\phi_n(x)$ are all positive, the functions $g_n(x_0,t)$ are all positive. By hypothesis $L_{n+1} \phi(t)$ is positive. Consequently the above integral is surely positive, as is each term on the right-hand side of the equation. Hence
$$\phi(x_0) > L_n \phi(t)\, g_n(x_0,t),$$
$$L_n \phi(t) < \frac{\phi(x_0)}{g_n(x_0,t)} < \phi(x_0)\, \frac{M_0 M_1 \cdots M_n}{m_0 m_1 \cdots m_n}\, \frac{n!}{(x_0 - t)^n}. \tag{34}$$


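Now referring to Theorem II and to formula (19), we see that
$$\phi(x) = L_0 \phi(t)\, g_0(x,t) + L_1 \phi(t)\, g_1(x,t) + \cdots + L_n \phi(t)\, g_n(x,t) + R_n,$$
$$R_n = L_{n+1} \phi(\xi)\, \phi_0(x) \int_t^x \phi_1(x) \cdots \int_t^x \phi_n(x) \int_t^x \frac{(dx)^{n+1}}{\phi_0(x) \cdots \phi_n(x)}, \qquad t < \xi < x,$$
$$R_n = L_{n+1} \phi(\xi)\, \frac{\phi_0(x)\phi_1(\xi_1) \cdots \phi_n(\xi_n)}{\phi_0(\xi_{n+1}) \cdots \phi_n(\xi_{n+1})}\, \frac{(x-t)^{n+1}}{(n+1)!}.$$
Setting $t = \xi$ in (34), we have
$$|R_n| < \phi(x_0)\, \frac{M_0 \cdots M_{n+1}}{m_0 \cdots m_{n+1}}\, \frac{M_0 \cdots M_n}{m_0 \cdots m_n}\, \frac{|x-t|^{n+1}}{(x_0 - \xi)^{n+1}}.$$
Evidently the remainder approaches zero as $n$ becomes infinite if
$$|x - t| < \frac{x_0 - t}{2}, \qquad x \ge a.$$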
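If now $t$ is allowed to approach $a$, the following expansion results:
$$\phi(x) = \sum_{n=0}^{\infty} L_n \phi(a)\, g_n(x,a), \qquad 0 \le x - a < \frac{x_0 - a}{2}. \tag{35}$$
But the series (35) converges in a larger interval. For
$$\sum_{n=0}^{\infty} L_n \phi(a)\, g_n(x,a) \ll \phi(x_0) \sum_{n=0}^{\infty} \left( \frac{M_0 \cdots M_n}{m_0 \cdots m_n} \right)^2 \frac{|x-a|^n}{(x_0-a)^n}, \tag{36}$$
and the dominant series converges for $|x - a| < x_0 - a$. It remains only to show that the sum of the series is $\phi(x)$ throughout the interval $a < x < b$.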

Denote the sum of the series (36) by $H(x)$. Then $H(x) = \phi(x)$ for $a \le x < (a + x_0)/2$. Choose a point $t$ in this interval near to $(x_0 + a)/2$. We have seen above that
$$\phi(x) = \sum_{n=0}^{\infty} L_n \phi(t)\, g_n(x,t), \qquad |x - t| < \frac{x_0 - t}{2}.$$
But by Theorem V
$$H(x) = \sum_{n=0}^{\infty} L_n H(t)\, g_n(x,t), \qquad |x - t| < x_0 - t, \quad x \ge a.$$
Consequently $H(x) = \phi(x)$ for $a \le x < (x_0 + t)/2$. Now choose a point $t'$ in this interval near to $(x_0 + t)/2$, and proceed as before to show that $H(x) = \phi(x)$ in $a \le x < (x_0 + t')/2$. By continuing the process we see that $H(x)$ and $\phi(x)$ coincide in the entire interval $a \le x < x_0$. But $x_0$ was an arbitrary point of $a < x < b$. Consequently equation (33) holds in this interval, and the proof is complete.

10. The expansion of an arbitrary analytic function. After imposing further conditions on the functions $\phi_i(x)$ it will be found possible to represent an arbitrary analytic function in a generalized power series. We define

CONDITIONS B. (a) Conditions A are satisfied in $(a, b)$;
(b) $\displaystyle \frac{d^k}{dx^k}\, \frac{1}{\phi_n(x)} \ge 0 \quad (a \le x \le b)$.

We now state a very simple lemma, the proof of which follows immediately from Leibniz's rule for the differentiation of a product.



LEMMA. If $f(x)$ is positive with positive derivatives of all orders, and if $\phi(x)$ is positive with derivatives of all orders that are positive or zero, then $(d/dx)(f(x) \cdot \phi(x))$ is positive with all its derivatives.


We are now in a position to prove 


THEOREM VII. If Conditions B are satisfied in $(a, b)$, and if $f(x)$ is analytic in $a < x < b$, then
$$f(x) = \sum_{n=0}^{\infty} L_n f(t)\, g_n(x,t), \qquad a < t < b,$$
the series being convergent in some neighborhood of $t$.
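Since $f(x)$ is analytic at $t$, it can be represented as a power series
$$f(x) = \sum_{n=0}^{\infty} a_n (x - t)^n, \qquad |x - t| < r.$$
Then it follows that the expansion
$$f(x) = \sum_{n=0}^{\infty} b_n \left( x - t + \frac{r}{3} \right)^n$$
is certainly valid in the interval $t - r < x < t + r/3$. Now set
$$g(x) = \sum_{n=0}^{\infty} |b_n| \left( x - t + \frac{r}{3} \right)^n, \qquad h(x) = \sum_{n=0}^{\infty} \{ |b_n| - b_n \} \left( x - t + \frac{r}{3} \right)^n,$$
so that
$$f(x) = g(x) - h(x),$$
$g(x)$ and $h(x)$ being functions that are either identically zero or positive with all their derivatives in $t - r/3 < x < t + r/3$. The trivial case in which $g(x)$ or $h(x)$ is identically zero may be discarded. Now by making successive applications of the Lemma it is seen that
$$L_n g(x) > 0, \qquad L_n h(x) > 0 \qquad \left( t - \frac{r}{3} < x < t + \frac{r}{3};\ n = 0, 1, 2, \dots \right).$$
Consequently Theorem VI may be applied to give
$$g(x) = \sum_{n=0}^{\infty} L_n g\!\left( t - \frac{r}{3} \right) g_n\!\left( x, t - \frac{r}{3} \right), \qquad t - \frac{r}{3} \le x < t + \frac{r}{3},$$
with a similar expansion for $h(x)$. Finally, we make use of Theorem V, and see that
$$g(x) = \sum_{n=0}^{\infty} L_n g(t)\, g_n(x,t),$$
$$h(x) = \sum_{n=0}^{\infty} L_n h(t)\, g_n(x,t),$$
$$f(x) = \sum_{n=0}^{\infty} L_n f(t)\, g_n(x,t), \qquad |x - t| < \frac{r}{3}.$$
The theorem is thus established.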

It should be pointed out that Conditions B are not so strong as to exclude the case of Taylor's development. For, they are surely satisfied for $\phi_n(x) = 1$. Moreover, other sets of functions $\phi_n(x)$ exist satisfying the conditions. Witness the set


on(x) = 


11. Teixeira's series. In the introduction reference was made to certain series studied by Teixeira. We wish to show by a consideration of the sequence (1'') how these series arise naturally as a generalization of Taylor's series. In order that the Wronskians $W_n(x)$ may all be positive we change the sign of certain of the functions of the sequence (1''), an alteration that will not affect the form of the series. Consider then the sequence
$$1,\ \sin x,\ -\cos x,\ -\sin 2x,\ \cos 2x,\ \dots,\ (-1)^{n-1} \sin nx,\ (-1)^n \cos nx,\ \dots \tag{37}$$
The operators $L_{2n+1}$ corresponding to this sequence have a particularly simple form:
$$L_{2n+1} = D(D^2 + 1^2)(D^2 + 2^2) \cdots (D^2 + n^2) \qquad (n = 0, 1, 2, \dots),$$
where $D$ indicates the operation of differentiation. The operators $L_{2n}$ are more complicated. Direct computations show that
$$W_{2n} = n!\, [(2n-1)!]^2\, [(2n-3)!]^2 \cdots [3!]^2.$$




By definition of the operator $L_{2n+1}$ we have
$$L_{2n+1} (-1)^n \sin (n+1)x = \frac{W(1, \sin x, -\cos x, \dots, (-1)^{n-1} \sin nx, (-1)^n \cos nx, (-1)^n \sin (n+1)x)}{W_{2n}} = \frac{W_{2n+1}}{W_{2n}},$$
whence
$$W_{2n+1} = W_{2n}\, D(D^2 + 1^2) \cdots (D^2 + n^2)\, (-1)^n \sin (n+1)x = W_{2n}\, (2n+1)!\, \cos (n+1)x.$$
Hence the Wronskians $W_n(x)$ are all positive at the origin, and the functions of approximation, $g_n(x)$, all exist. We shall show that
$$g_{2n}(x) = \frac{2^n}{(2n)!}\, [1 - \cos x]^n, \qquad g_{2n+1}(x) = \frac{2^n}{(2n+1)!}\, [1 - \cos x]^n \sin x.$$
By a familiar formula of trigonometry we have
$$\frac{2^n}{(2n)!}\, [1 - \cos x]^n = \frac{1}{(n!)^2} + 2 \sum_{k=1}^{n} \frac{(-1)^k \cos kx}{(n-k)!\,(n+k)!}.$$
This function clearly satisfies the differential equation
$$L_{2n+1} u(x) = 0. \tag{38}$$
Moreover, it satisfies the boundary conditions
$$u^{(k)}(0) = 0 \quad (k = 0, 1, 2, \dots, 2n-1); \qquad u^{(2n)}(0) = 1. \tag{39}$$
But the differential system (38) (39) has only one solution, the function of approximation $g_{2n}(x)$.
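The trigonometric identity behind this argument can be spot-checked numerically. The sketch below assumes the closed form $\frac{2^n}{(2n)!}(1-\cos x)^n$ for $g_{2n}(x)$ and its expansion $\frac{1}{(n!)^2} + 2\sum_{k=1}^{n} \frac{(-1)^k \cos kx}{(n-k)!\,(n+k)!}$; the function names are hypothetical.

```python
import math

def g_even(n, x):
    """Closed form 2**n / (2n)! * (1 - cos x)**n."""
    return 2 ** n / math.factorial(2 * n) * (1 - math.cos(x)) ** n

def g_even_cosine_sum(n, x):
    """The same function written as a cosine polynomial of degree n."""
    total = 1 / math.factorial(n) ** 2
    for k in range(1, n + 1):
        total += 2 * (-1) ** k * math.cos(k * x) / (math.factorial(n - k) * math.factorial(n + k))
    return total

for n in range(5):
    for x in (0.3, 1.7, -2.2):
        print(n, x, g_even(n, x) - g_even_cosine_sum(n, x))
```

Because the expansion contains only $1, \cos x, \dots, \cos nx$, each term is annulled by one of the factors $D$, $D^2 + 1^2, \dots, D^2 + n^2$, which is why the function satisfies the homogeneous equation of order $2n+1$.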
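By noting that
$$\frac{2^n}{(2n+1)!}\, [1 - \cos x]^n \sin x = \frac{2^{n+1}}{(2n+2)!}\, \frac{d}{dx}\, [1 - \cos x]^{n+1} = 2 \sum_{k=1}^{n+1} \frac{(-1)^{k+1}\, k \sin kx}{(n+1-k)!\,(n+1+k)!},$$
it is seen that this function satisfies the system
$$L_{2n+2} v(x) = 0,$$
$$v^{(k)}(0) = 0 \quad (k = 0, 1, 2, \dots, 2n); \qquad v^{(2n+1)}(0) = 1,$$
and consequently must be $g_{2n+1}(x)$.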


In the expansion of the function $f(x)$, the coefficients of the terms $g_{2n}(x)$ will involve the complicated differential operator $L_{2n}$. We may, however, express this coefficient in terms of a simpler operator of order $2n$. In doing this use will be made of the functions $\phi_n(x)$ which will now be computed:
$$\phi_0(x) = 1, \qquad \phi_1(x) = \cos x,$$
$$\phi_{2n}(x) = \frac{n}{\cos^2 nx}, \qquad \phi_{2n+1}(x) = 2(2n+1) \cos (n+1)x \cos nx \qquad (n = 1, 2, 3, \dots).$$
Evidently,
$$L_{2n+2} f(x) = \frac{(\cos (n+1)x)\, D L_{2n+1} f(x) + (n+1)(\sin (n+1)x)\, L_{2n+1} f(x)}{\cos (n+1)x}.$$
Consequently it follows that
$$L_{2n+2} f(0) = D^2 (D^2 + 1^2)(D^2 + 2^2) \cdots (D^2 + n^2) f(0).$$
The expansion of $f(x)$ now takes the form
$$f(x) \sim f(0) + \sum_{n=1}^{\infty} A_n\, \frac{2^n}{(2n)!}\, [1 - \cos x]^n + \sum_{n=0}^{\infty} B_n\, \frac{2^n}{(2n+1)!}\, [1 - \cos x]^n \sin x, \tag{40}$$
$$A_n = D^2 (D^2 + 1^2)(D^2 + 2^2) \cdots (D^2 + (n-1)^2) f(0),$$
$$B_n = D (D^2 + 1^2)(D^2 + 2^2) \cdots (D^2 + n^2) f(0).$$


Although the sequence (37) does not satisfy the Conditions A directly, a simple substitution reduces the series (40) to one for which these conditions are satisfied. Indeed we shall see that the substitution $y = \sin (x/2)$ reduces the series to the sum of two Taylor's series, so that the convergence can be easily discussed.

An alternative form of the series is obviously
$$f(x) \sim f(0) + \sum_{n=1}^{\infty} A_n\, \frac{2^{2n}}{(2n)!}\, \sin^{2n} \frac{x}{2} + \sum_{n=0}^{\infty} B_n\, \frac{2^{2n+1}}{(2n+1)!}\, \sin^{2n+1} \frac{x}{2} \cos \frac{x}{2}.$$
If the change of variable $x/2 = y$ is made, the form of the series employed by Teixeira* is obtained.

* For reference see § 1.


Now any function $f(x)$ analytic in the neighborhood of $x = 0$ can be expanded in a series of this type for a sufficiently small neighborhood of $x = 0$. For, if
$$\phi(x) = \frac{f(x) + f(-x)}{2}, \qquad \psi(x) = \frac{f(x) - f(-x)}{2}, \qquad \phi(x) + \psi(x) = f(x),$$
then the functions
$$\phi(2 \sin^{-1} y) \quad \text{and} \quad \frac{\psi(2 \sin^{-1} y)}{\cos \sin^{-1} y}$$
are both analytic in some neighborhood $|y| < \delta$ of $y = 0$. Hence they can be expanded in powers of $y$:
$$\phi(2 \sin^{-1} y) = \sum_{n=0}^{\infty} a_n y^{2n}, \qquad \frac{\psi(2 \sin^{-1} y)}{\cos \sin^{-1} y} = \sum_{n=0}^{\infty} b_n y^{2n+1}, \qquad |y| < \delta.$$
We have now only to make the substitution $y = \sin (x/2)$ in these series and to add in order to be assured that $f(x)$ can be expanded in a series (40) in some neighborhood of the origin. For simplicity expansions have been considered in the neighborhood of the origin, but the results clearly hold for an arbitrary point.


Bryn Mawr COLLEGE, 
Bryn Mawr, Pa. 


A PROBLEM IN THE CALCULUS OF VARIATIONS 
WITH AN INFINITE NUMBER OF 
AUXILIARY CONDITIONS* 


BY 
R. G. D. RICHARDSON 


INTRODUCTION 


The significance of the calculus of variations as a focal point of analysis 
has been emphasized by Hilbert and his school, and its intimate connection 
with the theories of mechanics, differential equations, integral equations, 
and quadratic forms in an infinite number of variables, has been used to the 
mutual benefit of all these disciplines. From one standpoint the problems of 
the calculus of variations may be regarded as problems of ordinary maxima 
and minima in a denumerable or non-denumerable infinity of independent 
variables; the imposition of a finite number of auxiliary conditions would 
then be equivalent to reducing the infinity of variables by a finite number. 
It is natural to inquire what will happen when a denumerable infinity of 
auxiliary conditions are imposed on the function involved in the integral to 
be minimized. In various branches of mathematics much light has been 
thrown on problems by a generalization from the finite to the infinite and 
it may reasonably be expected that there will be additional insight into the 
problems of the calculus of variations by the development of a similar ex- 
tension. 

This paper undertakes to make a beginning of such a study by treating a 
particular problem which has for its Euler condition a differential equation 
central in mathematical physics. Some of the results will appear as natural 
generalizations of criteria already known, while others seem in contradiction 
to them. 

The problem to be studied is intimately related to one discussed earlier†
by the author in which a finite number of auxiliary conditions were imposed.
That discussion concerned the solutions of the equation


(0.1) $L(u) = (p(x)u'(x))' + q(x)u(x) + \lambda k(x)u(x) = 0,$


subject to the boundary conditions 


* Presented to the Society, September 11, 1925; received by the editors July 14, 1926. 
† Das Jacobische Kriterium der Variationsrechnung und die Oszillationseigenschaften linearer
Differentialgleichungen 2. Ordnung, Mathematische Annalen, vol. 68, p. 279. 




156 R. G. D. RICHARDSON [January 
(0.2) u(0) = u(1) = 0. 


There are three distinct cases of the equation (0.1) which may be dis- 
tinguished as follows: 

(i). Orthogonal case. When k(x) is of one sign; for example, k(x) is positive
or zero and equal to zero only at a finite number of points in the interval.
The system (0.1), (0.2) has an infinite number of normalized characteristic
solutions $U_1, U_2, \cdots$, corresponding to the characteristic numbers
$\lambda_1 < \lambda_2 < \cdots$.

(ii). Polar case. When k(x) has both signs and q(x) < 0. The system (0.1),
(0.2) has an infinite number of normalized characteristic solutions $U_1, U_2, \cdots$
corresponding to the positive characteristic numbers $\lambda_1 < \lambda_2 < \cdots$ and an
infinite number $U_{-1}, U_{-2}, \cdots$ corresponding to the negative characteristic
numbers $\lambda_{-1} > \lambda_{-2} > \cdots$.

(iii). Complex case.* When k(x) has both signs and q(x) is positive in at least
part of the interval. The system (0.1), (0.2) as in the polar case has two
infinite sets of characteristic solutions and characteristic numbers. But, if
q(x) is large enough and positive, a finite number of the characteristic
numbers $\lambda_1, \cdots, \lambda_{m_1}, \lambda_{-1}, \cdots, \lambda_{-m_2}$ are complex, as are also the characteristic
solutions.

Exact theorems concerning the existence of extrema in the various cases 
are given in §3. In the other sections, however, unless explicit mention is 
made to the contrary the discussion concerns only the orthogonal case. The 
argument can generally be carried over to the polar case as is occasionally
indicated in the text or a footnote. In the complex case, the problems of 
the calculus of variations would ordinarily have no meaning. 

Intimately related to the differential equation is the calculus of variations
problem


(0.3) $D(u) = \int_0^1 [p u'^2 - q u^2]\,dx = \min.,$


the minimizing function u(x) being subject to the boundary conditions (0.2), 
the quadratic condition 


(0.4) $\int_0^1 k u^2\,dx = 1,$

and the linear conditions 


* This case was treated by the author, Contributions to the study of oscillation properties of the 
solutions of linear differential equations of the second order, American Journal of Mathematics, vol. 
40 (1918), p. 283. 


1928] A CALCULUS OF VARIATIONS PROBLEM 157 


(0.5) $\int_0^1 k U_i u\,dx = 0 \qquad (i = 1, \cdots, m-1),$


where $U_i(x)$ denotes the solution of the corresponding extremum problem
with $i-1$ linear conditions and which may be identified with the solutions
$U_i(x)$ of (0.1). The solution of the problem (0.3), (0.2), (0.4), (0.5) is then
furnished by $U_m(x)$ satisfying for $\lambda = \lambda_m$ the equation (0.1), to which the
Euler condition of all the minimum problems for $m = 1, 2, \cdots$ may be
reduced. From the equation


$\int_0^1 (p U_m'^2 - q U_m^2)\,dx = \lambda_m \int_0^1 k U_m^2\,dx,$


easily derived from (0.1), it will be noted that the value given to D(u) by
$U_m$ is $\lambda_m$. The Legendre condition


(0.6) $H_{u'u'} = 2p > 0$


built up after the usual Lagrange method for the function $H = p u'^2 - q u^2
+ \lambda k u^2 + \sum_{i=1}^{m-1} 2 \mu_i k U_i u$, and the Weierstrass condition


(0.7) $E = p (u' - \bar{u}')^2 \ge 0$


are satisfied not only by $U_m$ but by all the other admissible solutions
$U_{m+1}, U_{m+2}, \cdots$ of the Euler equation (0.1).

The chief interest naturally centered in the Jacobi condition, which 
excludes the possibility of the point conjugate to x =0 in the extended sense 
lying within the interval 0, 1. This condition picks out from the infinite
variety of functions $U_i$ automatically satisfying the Euler, Legendre and
Weierstrass conditions, that particular one, $U_m$, which minimizes the
integral D(u) under the conditions imposed. This it does by determining the
number of oscillations of the function in this interval. In §2 of the present 
memoir important extensions are made in the discussion of the Jacobi 
condition. 

Although in ordinary problems of the calculus both a maximum and a 
minimum of the function are usually sought, this has not been the case 
heretofore in problems of the calculus of variations. This is for the good and 
sufficient reason that one or other of these is infinite; for example, the 
maximum in the problem (0.3), (0.2), (0.4), (0.5) is infinite; in fact the 
conditions (0.6), (0.7) are interpreted to mean that no maximum is possible. 
In contradiction to these considerations for the ordinary case, some of the 
problems proposed in this paper possess both maximum and minimum solu- 
tions. 




Suppose there be added to (0.3), (0.2), (0.4), (0.5) the infinite number of 
linear conditions 


(0.8) $\int_0^1 k U_j u\,dx = 0 \qquad (j = s+1, \cdots;\ s \ge m);$

as is shown in §3 the minimum is not affected by the addition of these
conditions, being furnished by $U_m$ as before. But now a maximum of the
integral under the same conditions enters and is given by $U_s$. By computing
the Legendre and Weierstrass conditions for the infinitely extended problems 
it is found that they have respectively the forms (0.6), (0.7) as before; this 
fits in well with the preconceived notions of a minimum but since these 
conditions in the same form appear with the maximum problem as well, 
their significance has, for the moment at least, disappeared. This is perhaps 
more immediately evident if s is chosen equal to m. The only function 
orthogonal to $U_i$ for $i = 1, \cdots, m-1, m+1, \cdots$, and subject to the
conditions (0.2), (0.4) is $U_m$; this function then furnishes both a maximum
and minimum to the integral D(u), while criteria such as the Legendre and 
Weierstrass should, by all the rules of the game, be different for the two 
cases. In the treatment of the ordinary problem* the derivation of the 
Legendre condition is independent of other conditions such as the Jacobi; 
the same remark may be made concerning the Weierstrass condition as 
derived by the discoverer. It is noteworthy that the significance of these 
two criteria as independent conditions has vanished never to return so far 
as the problems of this paper are concerned. The Weierstrass necessary 
condition, however, is sometimes deduced on the hypotheses that the Jacobi 
condition is satisfied in the interval; and in that form, but for the minimum 
alone, it survives in the problem here discussed. Naturally the Legendre 
condition, which may be regarded as a less general form of the Weierstrass, 
must appear in the same rôle. These conditions might well be listed also in
some form in any set of sufficient conditions for a minimum of our problem. 
On the other hand for the maximum there would appear to be no conditions 
of the usual nature at all possible beyond the Euler equation. 

The expectation that the main interest of the new problem would center 
around the Jacobi condition concerning the conjugate point is fulfilled. For 
the minimum problem this criterion is placed alongside of the Euler as
fundamental. The generalized conjugate point must lie outside the interval 
for a minimum; for the maximum problem proposed it would then follow 


* For example, see Bolza, Variationsrechnung. This admirable treatise is a mine of information, 
and the author wishes to acknowledge his indebtedness to it. 




as a condition that the conjugate point lie within the interval. Lying without 
the interval is a definite criterion and naturally serves as one of a series of 
sufficient conditions; lying within the interval is a much more shadowy
condition. Probably the number of conjugate points existing in the interval 
is significant, but such a criterion would seem to indicate not much more 
than the number of steps the maximum problem is removed from the 
minimum problem. 

It appears then that for problems with an infinite number of auxiliary 
conditions imposed on the function it is to be expected that a generalization 
of the Euler conditions will retain its importance for both sorts of extrema, 
and that the generalization of the Jacobi condition will be vital for scruti- 
nizing the various possibilities that present themselves as solutions of the 
Euler equation. For one sort of extremum the Jacobi condition will probably 
serve both among the necessary and among the sufficient conditions, while 
for the other sort its significance will be negative only. On the other hand 
it is to be expected that the conditions arising as limiting cases of the 
Weierstrass and Legendre conditions will, for one sort of extrema, be relegated 
to positions subsidiary to the Jacobi condition, and for the other be dropped 
out of consideration. 

One might go a step further in indicating the breakdown of necessary 
conditions in problems with an infinite number of auxiliary conditions. In 
relative maxima and minima of two quadratic forms a necessary and suf- 
ficient condition for the existence of an extremum is that one of these forms 
be definite; which one does not matter. In the present discussion it is not 
necessary that p be of one sign in order that the integral (0.3) have an 
extremum. For example consider the problem 


$\int_0^1 (1 - 2x)\, y'^2\,dx = \text{extremum}, \qquad y(0) = y(1) = 0,$


1 1 
ff f = 0 
0 0 


where only those functions y(x) are to be considered which are continuous
and the square of whose derivative is integrable. It may be noted that the
only functions satisfying the auxiliary conditions are $c_1 \sin \pi x + c_2 \sin 2\pi x$,
$c_1^2 + c_2^2 = 1$. On setting this family of functions in the integral to be made an
extremum, there results a quadratic form in the variables $c_1$, $c_2$ from which
with the relation $c_1^2 + c_2^2 = 1$ the problem may be solved. It would appear
that in this case none of the usual necessary conditions have any significance,
not even the Euler condition. In this respect the problem bears some
resemblance to the special case of the ordinary isoperimetric problem where
the formal solution is a minimizing extremal for the integral involved in the
auxiliary condition.
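
The reduction of this example to a quadratic form in two variables can be carried out numerically. The following is a minimal sketch, assuming the admissible family is $c_1 \sin \pi x + c_2 \sin 2\pi x$ with $c_1^2 + c_2^2 = 1$ as stated above; the helper names `midpoint` and `entry` are illustrative and not part of the paper.

```python
import math

def midpoint(f, n=20000):
    """Composite midpoint rule on the interval [0, 1]."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))

def entry(i, j):
    # integral over [0, 1] of (1 - 2x) * (sin i*pi*x)' * (sin j*pi*x)'
    return midpoint(lambda x: (1 - 2 * x)
                    * i * math.pi * math.cos(i * math.pi * x)
                    * j * math.pi * math.cos(j * math.pi * x))

q11, q12, q22 = entry(1, 1), entry(1, 2), entry(2, 2)

# The extrema of q11*c1^2 + 2*q12*c1*c2 + q22*c2^2 on c1^2 + c2^2 = 1
# are the eigenvalues of the symmetric matrix [[q11, q12], [q12, q22]].
center = 0.5 * (q11 + q22)
radius = math.hypot(0.5 * (q11 - q22), q12)
minimum, maximum = center - radius, center + radius
```

The diagonal entries vanish by symmetry about x = 1/2, so the form reduces to $2 q_{12} c_1 c_2$ and the extrema are attained at $c_1 = \pm c_2$. Both a finite maximum and a finite minimum exist although the coefficient $1 - 2x$ changes sign in the interval.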

Courant has shown* that if in the problem (0.3), (0.2), (0.4), (0.5) the
linear conditions (0.5) be replaced by others more general


(0.9) $\int_0^1 k V_i u\,dx = 0 \qquad (i = 1, \cdots, m-1),$


where $V_i(x)$ are arbitrary continuous functions, and if the minimum (or
lower bound) of D(u) be denoted by $D(V_1, \cdots, V_{m-1})$, this minimum cannot
be greater than that of the original problem. In other words $\lambda_m$ is a minimax,
that is, the maximum of $D(V_1, \cdots, V_{m-1})$ which is itself the minimum of
D(u) under the conditions (0.2), (0.4), (0.9). Obviously $U_1$ furnishes a
minimin $\lambda_1$, that is, a minimum of $D(V_1, \cdots, V_{m-1})$. If there are a
denumerable infinity of the conditions (0.9), there can be no minimax, but the minimin
is still $\lambda_1$.

If the conditions (0.9) are divided into two groups $1, \cdots, l-1$;
$l, \cdots, m-1$, the minimum of D(u) will still be a function $D(V_1, \cdots, V_{m-1})$;
this may be maximized for $V_1, \cdots, V_{l-1}$, and minimized for $V_l, \cdots, V_{m-1}$,
the function $U_l$ giving a minimaximin $\lambda_l$.

In the case of both minimum and maximum of D(u) under the conditions 
(0.2), (0.4), (0.5), (0.8) the Euler equation obtained in a formal manner is 


m—1 
(pu’)’ + qu+rku — = 0 
1 


with solutions 


m—1 

representing an infinity-parameter family of extrema vanishing at x=0, the
function $u(x, \lambda)$ being a solution of the homogeneous equation (0.1) which
vanishes at x=0. It is shown later that $\mu_i = 0$ for all the minimizing and
maximizing extremals; in other words these extremals are solutions of the
homogeneous equation (0.1).

In discussing the necessity of the Euler equations it may be noted that 
in the infinite problem the variations which are admitted by the conditions 


* R. Courant and D. Hilbert, Methoden der Mathematischen Physik, p. 325. 




(0.5) must be linearly dependent on $U_m, U_{m+1}, \cdots$, so that any function
which cannot be expanded in terms of this partial set of orthogonal functions
is barred from consideration. In the problem of this paper admissible
variations must be linearly dependent on $U_m, \cdots, U_s$. The family may
thus be written


n= Bu(x,rA)+ 


It is significant that the only function common to this family of admissible 
variations and the family of extremals (0.10) is the minimizing extremal 
$u(x, \lambda)$.

In many respects the problems of this paper resemble those of relative 
extrema in quadratic forms involving a finite number or infinite number 
of variables. The imposition of auxiliary conditions may be regarded as 
reducing the number of degrees of freedom; when an infinite number of 
degrees of freedom are taken away there may be a finite or an infinite number 
remaining. To pursue this notion further let us consider $\sin i\pi x$ as a basic
set of functions in terms of which an arbitrary function u(x) vanishing at
x=0 and x=1 is to be expanded in the interval, and set up the corresponding
problem of relative extrema in quadratic forms in an infinity of variables.
Set 


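$u(x) = \sum_{i=1}^{\infty} c_i \sin i\pi x, \qquad U_l(x) = \sum_{i=1}^{\infty} d_{li} \sin i\pi x.$


The problem is to determine the c's so that the quadratic form


(0.11) $\int_0^1 \Big[ p \sum_i i\pi c_i \cos i\pi x \sum_j j\pi c_j \cos j\pi x - q \sum_i c_i \sin i\pi x \sum_j c_j \sin j\pi x \Big] dx = \sum_{ij} e_{ij} c_i c_j$


is an extremum under the quadratic condition

(0.12) $\int_0^1 k \sum_i c_i \sin i\pi x \sum_j c_j \sin j\pi x\,dx = \sum_{ij} g_{ij} c_i c_j = 1,$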
0 i tj 

and the infinite number of linear conditions 

(0.13) E di sin ixx c; sin ju = = 0. 
0 i a7 


This leads formally to the problem of finding an extremum for 




ij ij l ij 


subject to the conditions (0.12), (0.13) and on differentiation with regard 
to the c’s and the y’s gives rise to the linear equations 


(0.14) (es; + + Mi =0 
7 l i 


together with (0.13). In order that this infinity of linear homogeneous 
equations in c’s and y’s have a solution it is necessary that \ be a root of an 
infinite determinant consisting of four groups, each of infinite extent in both 
directions. This may be written 


(0.15) 
+ Agi €12 + hy jd; hyd; --- 
i i 


+ €22 + Ago2 > he ;™ hejd; 
7 


| 


7 j 
hyd; > 
i i 


The quadratic condition (0.12) fixes the multiplicative constant involved
in the solution of the homogeneous equations. Since from the method of
definition, $g_{ij} = g_{ji}$, $h_{ij} = h_{ji}$, the determinant is symmetric. Since
it is known in advance (§3) that both the maximum and minimum problems
have solutions, the infinite-bordered determinant must have $s-m+1$* roots
$\lambda_i$. For these values the solutions of the linear equations (0.12), (0.14)
furnish the various sets of c's which give not only the solutions $U_m$, $U_s$ of
the problems but also the other functions $U_{m+1}, \cdots, U_{s-1}$.

In studying these problems of maximizing and minimizing the quadratic 
form (0.11) under the quadratic condition (0.12) and the infinite number 
of linear conditions (0.13), the question naturally presents itself as to 


* For the minimum problem in the polar case the interesting situation develops that the
determinant corresponding to (0.15) has an infinite number of roots $\lambda$, each of which is known in
advance, and for each of which the equations (0.14) have solutions.




what is the condition (analogous in some respects to the Jacobi criterion
in the calculus of variations) which picks out, in one case, $U_m$, and in the other,
$U_s$, from the various possibilities $U_m, \cdots, U_s$. For the same problem in a
finite number of variables the author has derived this condition*; that
discussion suggests an analogous theorem here.

If instead of $\sin i\pi x$ the functions $U_i$ are used as basic system the
treatment is much simplified. As may be seen from the discussion in §3 all terms
of the determinant (0.15) vanish except those in the main diagonal of each
of the three non-zero divisions.

To indicate the connection† with the theory of integral equations, denote
by $G(x, \xi)$ the Green's function of the differential expression


(0.16) $M(u) = (pu')' + qu,$

corresponding to the boundary conditions (0.2). Then the integral equation

(0.17) $u(x) = \lambda \int_0^1 G(x, \xi) k(\xi) u(\xi)\,d\xi$

has the same solutions as the system (0.1), (0.2).
On setting M(u)=h(x), we have from the known properties of the 
Green’s function 


(0.18) $u(x) = -\int_0^1 G(x, \xi) h(\xi)\,d\xi.$


On the other hand, integration by parts gives 


(0.19) $D(u) = -\int_0^1 u(x) h(x)\,dx = \int_0^1 \int_0^1 G(x, \xi) h(x) h(\xi)\,dx\,d\xi.$
Thus the discussion of the extrema of the integral D(u) is reduced to that of 


the integral on the right of (0.19). If we multiply (0.17) by k(x) u(x) and 
integrate, we obtain the formula 


(0.20) $K_0(u) = \int_0^1 k u^2\,dx = \lambda \int_0^1 \int_0^1 G(x, \xi) k(x) k(\xi) u(x) u(\xi)\,dx\,d\xi = \lambda R(u),$


and hence from (1.10) we have, when $U_i$ is a characteristic function,


* Relative extrema of pairs of quadratic and hermitian forms, these Transactions, vol. 26, p. 491. 
† I am indebted to my colleague, Professor J. Tamarkin, for suggestions concerning the methods
used in connection with these expansions.


$D(U_i) = \lambda_i K_0(U_i) = \lambda_i^2 R(U_i).$


When q <0 the integral D is positive and hence the integral R is also positive. 
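
The relations between the differential expression, the Green's function, and the integral D(u) can be illustrated on the simplest case. The following is a minimal sketch, assuming p = 1, q = 0 (so that M(u) = u''), for which the Green's function in the sign convention u = -(integral of G h) is G(x, ξ) = x(1 - ξ) for x ≤ ξ and ξ(1 - x) otherwise; these choices and all names in the code are illustrative assumptions, not taken from the paper.

```python
import math

def green(x, xi):
    # Green's function of M(u) = u'' with u(0) = u(1) = 0,
    # normalized so that u(x) = -integral of G(x, xi) * h(xi)
    return x * (1 - xi) if x <= xi else xi * (1 - x)

def midpoint(f, n=400):
    """Composite midpoint rule on the interval [0, 1]."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))

def u(x):
    return math.sin(math.pi * x)

def h_fn(x):
    # h = M(u) = u'' for the chosen u
    return -math.pi**2 * math.sin(math.pi * x)

# Recover u at one point from h through the Green's function.
x0 = 0.3
u_from_green = -midpoint(lambda xi: green(x0, xi) * h_fn(xi))

# D(u) computed directly and as the double integral of G * h * h.
dirichlet = midpoint(lambda x: (math.pi * math.cos(math.pi * x))**2)
double_integral = midpoint(
    lambda x: midpoint(lambda xi: green(x, xi) * h_fn(x) * h_fn(xi)))
```

With these assumptions `u_from_green` should agree with `u(x0)` and `double_integral` with `dirichlet`, up to the quadrature error of the midpoint rule.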

In §3 the three integrals D(u), $K_0(u)$, and R(u) are discussed in regard
to relative maxima and minima under an infinite number of linear auxiliary 
conditions. 

The Jacobi condition as discussed in §6 concerns the non-vanishing of an
infinite determinant involving integrals. When any finite number of con- 
ditions are dropped from the set of linear conditions (0.8), the infinite 
determinant corresponding to the resulting problem has again no zero in 
the interval 0, 1, and it is a curious fact that its ratio to the original is a 
decreasing function throughout the interval. 

A portion of the discussion in this paper is too formal, omitting much in 
the way of justification of infinite processes. Since, however, the extrema 
actually exist, the main argument is correct and the briefer treatment has 
its advantages. 

It may be noted further that the linear character of all except one of the 
auxiliary conditions renders the treatment much simpler than it would be 
in the general case. In particular the analogons of the Legendre and
Weierstrass criteria and of the Hamilton function and Hilbert integral have
very simple forms. 

The results of this paper as here given for the simple boundary conditions 
(0.2) may be extended without difficulty to more complicated cases. The 
treatment as given for one independent variable may be readily generalized 
to regions of two or more dimensions. With the exception of the process of 
taking the derivatives of the quotients of the determinants arising in the 
discussion of the Jacobi condition, all notions and methods go over almost 
without change to the more general problem. The interpretation of the Jacobi 
condition in terms of oscillation theorems for two or more independent 
variables, however, is obscure and difficult and has not been worked out. 


1. PRELIMINARY THEOREMS AND FORMULAS 


In this section we shall assemble some fundamental formulas for later 
reference and shall review some of the considerations of the paper* which 
treats the case of a finite number of auxiliary conditions. 

Basic for the argument is the self-adjoint differential equation of the 
second order 


(1.1) $L(u) = (p(x)u'(x))' + q(x)u(x) + \lambda k(x)u(x) = 0,$


* Loc. cit., Mathematische Annalen, vol. 68, p. 269 




with the boundary conditions 
(1.2) u(0) = u(1) = 0, 


where p>0, and where p, q and k are analytic functions* of x in the interval
0, 1 considered.

The general solution $\alpha u_1(x, \lambda) + \beta u_2(x, \lambda)$ of (1.1) contains two arbitrary
constants besides the parameter $\lambda$. Since the discussion of this paper concerns
only the family through x=0, u may be chosen so as to vanish there and the
solution may then be written


(1.3) $u = \alpha u_1(x, \lambda)$


where it is assumed for the sake of uniformity and without loss of generality
that $\alpha > 0$, $u_1'(0, \lambda) > 0$. As $|\lambda|$ increases all the zeros of $u_1(x, \lambda)$ (except that
at x=0) move to the left.

As noted in the Introduction, there are two important cases connected 
with the problems of the calculus of variations. In the orthogonal case there 
is an infinite set 


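(1.4) $U_1, U_2, \cdots$

of solutions of (1.1), (1.2) and in the polar case there are two such sets

(1.5) $U_1, U_2, \cdots;\ U_{-1}, U_{-2}, \cdots.$


Solutions can be considered orthogonalized and normalized:


(1.6) $\int_0^1 k U_i U_j\,dx = 0 \ (i \ne j), \qquad \int_0^1 k U_i^2\,dx = 1 \qquad \Big[ \int_0^1 k U_{-i}^2\,dx = -1 \Big].$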
0 0 0 


For the orthogonal case, the equation (1.1) is the Euler condition for 
the calculus of variations problem 


(1.7) $D(u) = \int_0^1 (p u'^2 - q u^2)\,dx = \min., \qquad u(0) = u(1) = 0,$

subject to the quadratic auxiliary condition


(1.8) $K_0 = \int_0^1 k u^2\,dx = +1.$


* The main features of the discussion can be carried through under much less stringent condi- 
tions. 



For, on setting 

(1.9) $v_0(x) = \int_0^x k u^2\,dx$, and hence $v_0' - k u^2 = 0, \quad v_0(0) = 0, \quad v_0(1) = 1,$


an application of the Lagrange method transforms the relative minimum
problem into that of finding an absolute minimum of the integral


$\int_0^1 [ p u'^2 - q u^2 + \lambda (v_0' - k u^2) ]\,dx$


having for Euler condition the equation (1.1).
The solution of the minimum problem must then be found among (1.4);
from the formulas easily derived from (1.1),


(1.10) $\int_0^1 (p u'^2 - q u^2)\,dx = \lambda \int_0^1 k u^2\,dx,$


it follows that the minimum value is one of the $\lambda$'s. Since all the other
conditions of the minimum problem are satisfied by any of the functions
$U_1, U_2, \cdots$, it must be the Jacobi criterion alone which determines that
particular one, $U_1$, having no zero within the interval.
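
The statement that the quotient of D(u) by the quadratic integral takes the values $\lambda_m$ on the characteristic solutions, and a larger value than $\lambda_1$ on any other admissible function, can be illustrated numerically. The following is a minimal sketch, assuming the simplest orthogonal case p = 1, q = 0, k = 1, where the characteristic solutions are multiples of sin mπx with characteristic numbers m²π²; the function names are illustrative.

```python
import math

def midpoint(f, n=20000):
    """Composite midpoint rule on the interval [0, 1]."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))

def quotient(u, du):
    # D(u) / K_0(u) for p = 1, q = 0, k = 1
    return midpoint(lambda x: du(x)**2) / midpoint(lambda x: u(x)**2)

# Values on the first three characteristic solutions.
values = [
    quotient(lambda x, m=m: math.sin(m * math.pi * x),
             lambda x, m=m: m * math.pi * math.cos(m * math.pi * x))
    for m in (1, 2, 3)
]

# A trial function vanishing at both ends that is not characteristic.
trial = quotient(lambda x: x * (1 - x), lambda x: 1 - 2 * x)
```

Under these assumptions `values` approximates π², 4π², 9π², while the trial function x(1 - x) gives a quotient slightly above π², in agreement with $U_1$ furnishing the minimum.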

If the extremum problem (1.7), (1.8) is changed by the addition of 
the linear conditions 


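(1.11) $K_i = \int_0^1 k U_i u\,dx = 0 \qquad (i = 1, \cdots, m-1),$

the solution $U_1$ is barred from consideration. On setting

(1.12) $v_i(x) = \int_0^x k U_i u\,dx$, and hence $v_i' - k U_i u = 0, \quad v_i(0) = v_i(1) = 0,$


and considering the problem of minimizing the integral


$\int_0^1 \Big[ p u'^2 - q u^2 + \lambda (v_0' - k u^2) + \sum_{i=1}^{m-1} 2\mu_i (v_i' - k U_i u) \Big] dx,$


the Euler equation takes the non-homogeneous form


(1.13) $(pu')' + qu + \lambda k u + \sum_{i=1}^{m-1} \mu_i k U_i = 0$


with solutions


(1.14) $u = \alpha u_1(x, \lambda) - \sum_{i=1}^{m-1} \frac{\mu_i U_i}{\lambda - \lambda_i}$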




as may be proved by substitution. This family of curves may be used
as the extremals of the problem. It is possible to show that for the minimizing
extremal of this family all the $\mu$'s are zero. For, on setting the value of u
from (1.14) in (1.11), using the boundary conditions and the relations
(1.6), we find that
1 ‘ 
Jo A— 

from which it follows that $\mu_i = 0$. The differential equation of the minimizing
extremals is thus reduced from (1.13) to (1.1). The Jacobi condition selects
the solution which is in this case $U_m$ with $m-1$ zeros within the interval.
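
The substitution argument can be illustrated on the simplest case. The following is a minimal sketch, assuming p = 1, q = 0, k = 1 and m = 2, so that there is a single multiplier $\mu_1$, with $U_1 = \sqrt{2} \sin \pi x$, $\lambda_1 = \pi^2$, and the family taken in the form $u = \alpha u_1(x, \lambda) - \mu_1 U_1 / (\lambda - \lambda_1)$; the numerical values of α and $\mu_1$ and all function names are arbitrary illustrative choices.

```python
import math

LAM1 = math.pi**2            # lambda_1 for p = 1, q = 0, k = 1

def U1(x):
    return math.sqrt(2) * math.sin(math.pi * x)

def u1(x, lam):
    # solution of u'' + lam * u = 0 with u(0) = 0, u'(0) = 1
    r = math.sqrt(lam)
    return math.sin(r * x) / r

def extremal(x, lam, alpha, mu1):
    # assumed form of the family for m = 2 (one multiplier mu_1)
    return alpha * u1(x, lam) - mu1 * U1(x) / (lam - LAM1)

def residual(x, lam, alpha, mu1, h=1e-4):
    # left side of the non-homogeneous Euler equation,
    # u'' + lam * u + mu_1 * U_1, with u'' from central differences
    u = lambda t: extremal(t, lam, alpha, mu1)
    u_xx = (u(x + h) - 2 * u(x) + u(x - h)) / h**2
    return u_xx + lam * u(x) + mu1 * U1(x)

def midpoint(f, n=20000):
    """Composite midpoint rule on the interval [0, 1]."""
    step = 1.0 / n
    return step * sum(f((i + 0.5) * step) for i in range(n))

# At lambda = lambda_2 the function u_1 is orthogonal to U_1, so the
# linear condition reduces to -mu_1 / (lambda_2 - lambda_1) = 0.
lam2 = 4 * math.pi**2
K1 = midpoint(lambda x: U1(x) * extremal(x, lam2, 1.0, 0.7))
```

Here `residual` is close to zero for any choice of the parameters, while `K1` equals $-\mu_1/(\lambda_2 - \lambda_1)$ and so vanishes only when $\mu_1 = 0$, which is the content of the argument above in this special case.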

It should be noted that for some purposes, such as the Jacobi condition, 
it is well to interpret the family of extremals as being in higher dimensional 
space. By adding to the two dimensions $x, u$ of (1.14), a third $v_0$ given by
(1.9) and $m-1$ more $v_i$ given by (1.12), the extremals may be considered to
be curves in the $(m+2)$-dimensional $(x, u, v_0, v_i)$-space.

For the polar case, there are two sets of calculus of variations problems
for which the equation (1.1) is the Euler condition. One is precisely that of
the formulas (1.7) to (1.15); the other is set up by replacing the quadratic
condition (1.8) by $K_0 = -1$.

Let f(x) be any function which vanishes at x=0 and x=1 and which
can be represented in the form


$f(x) = \int_0^x \varphi(\xi)\,d\xi,$


where $\varphi(x)$ is integrable together with its square. It is known* that f(x)
can be expanded in an absolutely and uniformly convergent series (in both
the orthogonal and the polar case):


(1.16) $f(x) = \sum_i f_i U_i(x), \qquad f_i = \operatorname{sign} \lambda_i \int_0^1 k f U_i\,dx,$


the summation being taken over all the characteristic values. Substituting
here $f(\xi) = G(x, \xi)$ and observing from (0.17) that


(1.17) $U_i(x) = \lambda_i \int_0^1 G(x, \xi) k(\xi) U_i(\xi)\,d\xi,$


* J. Tamarkine, Probléme du développement d’une fonction arbitraire en séries de Sturm-Liouville, 
Comptes Rendus, vol. 156 (1913), pp. 1589-1591; L. Lichtenstein, Zur Analysis der unendlichvielen 
Variabeln, Rendiconti del Circolo Matematico di Palermo, vol. 38 (1914), pp. 113-166. 




it is readily seen that 


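(1.18) $G(x, \xi) = \sum_i \frac{U_i(x) U_i(\xi)}{|\lambda_i|},$


the series being absolutely and uniformly convergent.
If h(x) be an integrable function, on multiplying (1.18) by $h(x) h(\xi)$ and
integrating we get the bilinear formula


(1.19) $\int_0^1 \int_0^1 G(x, \xi) h(x) h(\xi)\,dx\,d\xi = \sum_i \frac{h_i^2}{|\lambda_i|}, \qquad h_i = \int_0^1 h U_i\,dx.$


Let us now identify h(x) with M(u) as defined in (0.16). If u(x) is a function
whose first derivative is absolutely continuous in 0, 1, then $u''(x)$ exists
almost everywhere and is integrable. Hence from (0.19) and (1.19) we
have $D(u) = \sum_i (h_i^2 / |\lambda_i|)$, and since from (1.19), (1.17), and (0.18)


$h_i = \int_0^1 h(x) U_i(x)\,dx = -\lambda_i \int_0^1 k u U_i\,dx,$

this may be written

(1.20) $D(u) = \sum_i |\lambda_i| c_i^2, \qquad c_i = \operatorname{sign} \lambda_i \int_0^1 k u U_i\,dx.$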


We note also that from (1.16) there follows

(1.21) $K_0(f) = \int_0^1 k f^2\,dx = \sum_i \operatorname{sign} \lambda_i \, f_i^2; \qquad K_0(u) = \sum_i \operatorname{sign} \lambda_i \, c_i^2.$
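
These two expansions can be checked on a simple function. The following is a minimal sketch, assuming the orthogonal case p = 1, q = 0, k = 1 with $U_i = \sqrt{2} \sin i\pi x$ and $\lambda_i = i^2 \pi^2$, and the test function u(x) = x(1 - x); the truncation at 60 modes and the names used are illustrative choices.

```python
import math

def midpoint(f, n=4000):
    """Composite midpoint rule on the interval [0, 1]."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))

def u(x):
    return x * (1 - x)

def U(i, x):
    # normalized characteristic solutions for p = 1, q = 0, k = 1
    return math.sqrt(2) * math.sin(i * math.pi * x)

modes = range(1, 61)
c = {i: midpoint(lambda x, i=i: u(x) * U(i, x)) for i in modes}

# Series sides: sum of |lambda_i| c_i^2 and sum of c_i^2.
D_series = sum((i * math.pi)**2 * c[i]**2 for i in modes)
K_series = sum(c[i]**2 for i in modes)

# Integral sides: D(u) = integral of u'^2, K_0(u) = integral of u^2.
D_direct = midpoint(lambda x: (1 - 2 * x)**2)
K_direct = midpoint(lambda x: u(x)**2)
```

Under these assumptions the two sides of each relation agree up to the truncation of the series; only the odd-indexed coefficients are different from zero because u is symmetric about x = 1/2.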
0 


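Some relations between the various solutions of (1.1) are important and
will now be developed. It may be noted that the function $u(x, \lambda)$ satisfies
the equation

(1.22) $\Big( p \frac{\partial u'}{\partial \lambda} \Big)' + (q + \lambda k) \frac{\partial u}{\partial \lambda} + k u = 0, \qquad \frac{\partial u(0)}{\partial \lambda} = 0.$

On multiplication of (1.1) for the characteristic number $\lambda_m$ and solution
$U_m$ by $\partial u / \partial \lambda$ and of (1.22) by $-U_m$, addition and integration, we obtain
the relation


(1.23) $U_m' \frac{\partial u}{\partial \lambda} - U_m \frac{\partial u'}{\partial \lambda} = \frac{\lambda - \lambda_m}{p} \int_0^x k U_m \frac{\partial u}{\partial \lambda}\,dx + \frac{1}{p} \int_0^x k U_m u\,dx,$


which for the special case $u = U_1$ and $m = 1$ becomes


(1.24) $U_1' \frac{\partial U_1}{\partial \lambda} - U_1 \frac{\partial U_1'}{\partial \lambda} = \frac{1}{p} \int_0^x k U_1^2\,dx.$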




If $U_m$, $U_l$ denote any two of the family (1.4) (or (1.5)) corresponding
to the parameters $\lambda_m$, $\lambda_l$ it may be proved in a similar manner that they
satisfy the identity

(1.25) $U_m' U_l - U_m U_l' = \frac{\lambda_l - \lambda_m}{p} \int_0^x k U_m U_l\,dx.$

If $\eta$ is a continuous function vanishing at 0 and $x_1$, on multiplication of
L(u)=0 by $\eta$ and integration by parts there results a relation $D(u, \eta)
= \lambda K_0(u, \eta)$ between the polar forms $D(u, \eta) = \int_0^{x_1} (p u' \eta' - q u \eta)\,dx$, $K_0(u, \eta)
= \int_0^{x_1} k u \eta\,dx$. Further, let $\eta$ be an allowable variation in the interval $0, x_1$
for the problem (1.7), (1.8), (1.11); that is, let


$K_0(u, \eta) = 0, \qquad K_i(\eta) = \int_0^{x_1} k U_i \eta\,dx = 0;$


then $D(u, \eta) = 0$.


2. AN EXTENSION OF THE FINITE PROBLEM. THE JACOBI 
CONDITION AND ITS INTERPRETATION 


Using the method of the earlier paper* let us pursue considerably further 
than was there necessary the question of the Jacobi condition for the finite 
case. Consider the new problem of a minimum of D(u) under the boundary 
conditions u(0) = u(1) = 0, the quadratic condition $K_0 = 1$ (1.8) and the two
sets of linear conditions


(2.1) $\int_0^1 k U_i u\,dx = 0 \qquad (i = 1, \cdots, m-1;\ i = s+1, \cdots, l).$


The addition of the second set of (2.1) cannot decrease the minimum; that
it is not increased is readily seen by noting that the function furnishing the
minimum for the first set (2.1) only is $U_m$, and that this function also satisfies
the second set. That the minimum is $\lambda_m$ furnished by $U_m$ may also be proved
in a manner analogous to Theorem IV of §3. In other words the second set
of conditions (2.1) affects the problem only formally. The Euler equation
takes the form


(2.2) $(pu')' + qu + \lambda k u + \sum_{i=1}^{m-1} \mu_i k U_i + \sum_{i=s+1}^{l} \mu_i k U_i = 0,$


with the (m+/—s+1)-parameter family of solutions through the origin 


* Loc. cit., Mathematische Annalen, vol. 68, p. 289. 


(2.3) $u = \alpha u_1(x, \lambda) - \sum_{i=1}^{m-1} \frac{\mu_i U_i}{\lambda - \lambda_i} - \sum_{i=s+1}^{l} \frac{\mu_i U_i}{\lambda - \lambda_i},$

where $\alpha u_1(x, \lambda)$ is defined as in (1.3). By a method similar to that used in
§1, the plane family of extremals (2.3) may be replaced by an $(m+l-s+1)$-
parameter $(\alpha, \lambda, \mu_i)$-family in $(m+l-s+2)$-dimensional $(x, u, v_i)$-space
U; \? 
(2.4) 
0 


t 


and passing through the origin $(0, 0, \cdots, 0)$. The summation over i, here
as hereafter, is supposed to extend through the range $1, \cdots, m-1$,
$s+1, \cdots, l$. It may be shown as in §1 that in this family (2.4) there is
imbedded the space curve corresponding to the minimizing extremal
$u = U_m(x)$ and for which $\mu_i = 0$, $\lambda = \lambda_m$. Geometrically interpreted, the
Jacobi condition demands that within the x-interval 0, 1 this space curve
be not cut by any of its neighbors. This is equivalent to saying that the
$m+l-s+1$ homogeneous equations in as many unknowns


Ou;(x, Am U; 
al +. Sax | =) = 0, 


Ou, 2abu; 7 
(2.5) f ku,;—-dx + data — J ku,U dx = 0, 
0 0 Am — 


aan f + ku U jdx — ) favs = 0 
0 Or 0 \Am — As/ Jo 


have no solution for 0<x<1. The value of x next after x=0 for which
these equations hold is called the conjugate point in the extended sense.
And the Jacobi condition demands that this conjugate point lie beyond the
point x=1. Now an infinite set $U_m, \cdots, U_s, \cdots$ of characteristic
solutions of (1.1) satisfy all the others of the set of sufficient conditions; hence
it is the Jacobi condition alone which selects $U_m$ as the minimum. It will
later be shown that the condition implies that $u_1$ vanish $m-1$ times in the
interval, thus identifying it with $U_m$ except for a constant multiplier.

On the analytic side, the Jacobi condition concerns the sign of the 
second variation. For the purpose of calculating the second variation, we 
may take the integral in the form 


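$\int_0^1 \Big[ p u'^2 - q u^2 + \lambda_m (v_0' - k u^2) + \sum_i 2\mu_i (v_i' - k U_i u) \Big] dx.$


The admissible variation $\eta$ is subject to the restrictions $\eta(0) = \eta(1) = 0$ and

(2.6) $\int_0^1 k u \eta\,dx = 0, \qquad \int_0^1 k U_i \eta\,dx = 0 \quad (i = 1, \cdots, m-1;\ s+1, \cdots, l).$

After the usual computation we find

(2.7) $\delta^2 D = \int_0^1 (p \eta'^2 - q \eta^2 - \lambda_m k \eta^2)\,dx$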


which by integration by parts and addition of multiples of the linear terms 
(2.6) becomes 


1 
(2.8) ef — n[(pn’)’ + qn + Amkn + + 
0 


Consider the expression inside the brackets of this integrand; it will vanish
if for $\eta$ we substitute the left hand side of the first line of (2.5), as can be
proved by substitution and use of (1.1) and (1.22) and remembering that
$\rho_i = \delta\mu_i$. It follows that if x=1 is the point conjugate to x=0, the second
variation $\delta^2 D$ may be made zero by giving to $\eta$ this value.

A similar argument may be applied to any interval $0, x_1$, where $x_1$ is the
point conjugate to 0. If $x_1$ is within the interval 0, 1, and if $\eta$ is an admissible
variation over $0, x_1$, we may set $\eta = 0$ in the interval $x_1, 1$. In that case the
conditions (2.6) still hold and the second variation may still be written in
the form (2.8) and may be made to vanish by the same device.

The original minimum problem for u can be put into essentially the same
form as (2.7), (2.6) with a proper quadratic restriction. Hence the minimum
for $D(\eta)/K_0(\eta)$ is $\lambda_m$ furnished by $\eta = U_m(x)$ which is an analytic function.
Any variation which is zero in a part of the interval cannot be analytic and
hence cannot furnish the minimum for (2.7). In that case $\delta^2 D$ can be made
negative, which indicates that the point conjugate to x=0 cannot lie within
the interval.

For the sake of definiteness, let us choose m=2, s=3, l=4 and proceed
to set up in detail the Jacobi condition. The Jacobi determinant of (2.5), 
apart from a constant factor, is 




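(2.9) $$D_{14}(x,\lambda)=\begin{vmatrix}
\dfrac{\partial u_1}{\partial\lambda} & u_1 & U_1 & U_4\\[2mm]
\int_0^x k u_1 \dfrac{\partial u_1}{\partial\lambda}\,dx & \int_0^x k u_1^2\,dx & \int_0^x kU_1 u_1\,dx & \int_0^x kU_4 u_1\,dx\\[2mm]
\int_0^x kU_1 \dfrac{\partial u_1}{\partial\lambda}\,dx & \int_0^x kU_1 u_1\,dx & \int_0^x kU_1^2\,dx & \int_0^x kU_1U_4\,dx\\[2mm]
\int_0^x kU_4 \dfrac{\partial u_1}{\partial\lambda}\,dx & \int_0^x kU_4 u_1\,dx & \int_0^x kU_1U_4\,dx & \int_0^x kU_4^2\,dx
\end{vmatrix}$$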


and for a minimum the Jacobi condition asserts that this can have no zero 
within the interval. It vanishes at x=0, but not at x=1 since at that point 
its value is $\partial u_1/\partial\lambda$ and from (1.24) and (1.6) it is evident that not both $u_1$
and $\partial u_1/\partial\lambda$ can vanish at x=1.

Add to the conditions of this special problem the further one 


$$\int_0^1 kU_3\,u\,dx = 0.$$


The solution is still $U_2$, but in place of $D_{14}(x,\lambda)$ there is a five-rowed
determinant $D_{134}(x,\lambda)$, which is obtained by inserting between the third and
fourth rows of (2.5) a new row similar to these except that $U_3$ replaces $U_1$ or
$U_4$ and between the third and fourth column a new column in similar fashion.
Schematically this new determinant, which must not vanish within the 
interval, may be expressed as follows: 


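(2.10) $$D_{134}(x,\lambda) = \begin{vmatrix}
\partial u_1/\partial\lambda & u_1 & U_1 & U_3 & U_4\\
a_{21} & a_{22} & a_{23} & a_{24} & a_{25}\\
a_{31} & a_{32} & a_{33} & a_{34} & a_{35}\\
a_{41} & a_{42} & a_{43} & a_{44} & a_{45}\\
a_{51} & a_{52} & a_{53} & a_{54} & a_{55}
\end{vmatrix}.$$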


Before proceeding further with the main argument, let us prove a funda- 
mental lemma, the compact form of the proof of which is due to my colleague, 
Professor H. P. Manning. 

Lemma. Given two determinants, $D_m$ and $D_{m+1}$, of the mth and (m+1)th
orders, respectively, the first being a first minor of the second,


Wi 


(2.9) 
Or 
Ou, 
f —— 
Or 
Dia(x,d) 
Ou, 
0 Or 
Ou, 
dx 
0 Or 
Ou, 
Or 
a2 a23 25 
| G32 433 a34 |- 
a4 a2 45 
as 52 a54 255 
Wm-1 | | Wi . . . Wm 
am-1 Qm-1,1 °° * Gm 




and the terms being subject to the following conditions: 
= wwf — ww; = +)ai, — w/w = (1; — 


and such that the derivatives of the terms of any row other than the first form
multiples of the first. Denoting by $\alpha, \alpha_1, \cdots, \alpha_{m-1}$ the cofactors of the first row
of $D_m$ and by $A, A_1, \cdots, A_m$ those of the first row of $D_{m+1}$ and for completeness
of notations setting $\alpha_m=0$, then


d Dm 


From the theory of determinants the following identities may be written 
down: 


(2.11) =0 =1,---, m—1); amat = Am; 


t=1 t=1 
(2.12) + = 0 (Gj=1,--+-,m). 

t=1 
Since by hypothesis the determinants formed by replacing any row except 
the first by its derivative are zero, the numerator of the derivative of the 
quotient of the determinants can be written in the form of a single
determinant of the second order

wa + + +++ + Wndm wa+ Om 


(2.13) 
wA + wAit:::+ + wn Am 


In expanding (2.13) the terms involving either w or w’ may be written 


>> (ww/ —w’w,)(aA;—Aa,). By hypothesis this becomes 
f(x) — + — Aas) 
= f(x)[e(aiA + — A(aia + + — 
which by (2.11) and (2.12) becomes 
(2.14) — 


The other terms of the expansion of (2.13) are >> ;;(waw/ —w/w,)(aiA ;—a;A)) 
(j>i) which by hypothesis may be written 


Dll; — — a;A)) G > 4) 


= DIA; Daas — Dla; >> Avai; (i and j independent). 
7 i i i 




Adding this last expression to (2.14) we have for the numerator of the de- 
terminant 


ao; + Dawei) (40; + 


By (2.11) and (2.12) this reduces to 1nAwm*. The derivative is thus 
m?/Dm21 and has the sign of J. 

Further it may be noted that if from $D_{m+1}$ we pick out another minor
$E_{st}$ by leaving out any row (the sth) except the first and any column (the tth)
except the first, the same argument holds and we find that


The argument may also be applied in a formal fashion when m is infinite. 

Returning now to the main discussion it is possible to write down the 
derivative with regard to x of the quotient of (2.10) and its first minor (2.9). 
As may be seen from (1.23), (1.24), (1.25), the conditions of the lemma are 
satisfied by the determinants (2.9), (2.10). 

Hence we have

(2.15) $$\frac{d}{dx}\,\frac{D_{14}}{D_{134}} = \frac{\lambda_2-\lambda_3}{p\,D_{134}^2}\,a_3^2 < 0,$$

where $a_3$ is the cofactor of $U_3$ in (2.10).
Now $D_{134}$ and $D_{14}$ vanish at x=0, do not vanish in the interval, and have
at x=1 the same value

$$\frac{\partial u_1(x,\lambda)}{\partial\lambda}$$

which is positive as may be noted from (1.24), since $U_2(1)=0$ and $U_2'(1)>0$,
this being the second zero beyond x=0 for this function.

The formula (2.15) indicates that the roots of $D_{134}$ and $D_{14}$ separate each
other and since it may be shown as in §7 of the paper cited that at x=0 the
determinant $D_{134}$ has the higher order of zero, the quotient $D_{14}/D_{134}$ starts
at x=0 with a value $+\infty$ and having a value 1 at x=1, vanishes at the first
zero of $D_{14}$ which must lie before that of $D_{134}$.

If $D_1$ denotes the determinant obtained by omitting the last row and
column from (2.9), the same argument shows that

— 
_-—s= — (function)? < 0. 
dx Dis p(x) 


Dm AAs 
ik 


$D_1$ vanishes at x=0 of lower order than $D_{14}$ and has at x=1 the same value;
the next root of $D_1$ must then lie before that of $D_{14}$.

In descending one step further in the order of the determinant, the
argument is somewhat different and coincides with that of the earlier paper.
The formal process of finding by means of the Lemma the derivative of
the quotient of two determinants is the same but since in all cases the sign
of the result depends on $\lambda_2-\lambda_i$, the derivative is negative when any condition
of the second set of (2.1) is involved and positive when all of these are
omitted. The ratio of the determinants at x=0 is $+\infty$ in the first case and
$-\infty$ in the other.

When the last row and column of $D_1$ are omitted and the remaining
two-rowed determinant denoted by $D$, it was shown in the earlier paper*
(and also follows from the discussion here) that 


— (function)? > 0, 
dx p(x) 
and further that 
d 1 


— (function)? > 0, 
dx p(x) 


and from these facts that $u_1$ has precisely one zero between x=0 and x=1.


The Jacobi condition is thus vital in the final handling of this calculus of 
variations problem. Its place among the necessary conditions and among the 
sufficient conditions is a fundamentally important one. 

The determinant $D_{14}$ is obtained from $D_{134}$ by omitting the fourth row and
fourth column; if a different four-rowed minor be selected from $D_{134}$ by
omitting any row except the first and any column, a formula for the derivative
of the quotient of it by $D_{134}$ may be obtained in the same manner.† If the
minor selected be symmetrically placed with regard to the main diagonal,
the derivative will involve the square of a minor as in (2.8); if it is not
symmetrically placed this square is replaced by the product of two different
minors. This process may be repeated step by step until one arrives at $u_1$.
The ratio of $D_{134}$ (or of $u_1$) to any minor of any order symmetrical to the
diagonal is a function of x monotone in the interval 0, 1 and one may descend
from $D_{134}$ to $u_1$ by ladders different from that used above; but in each case
the argument determines the exact number of zeros of $u_1$.


* Loc. cit., Mathematische Annalen, vol. 68, p. 269.
† Cf. the sequel of the lemma.




Returning to the other end of the series of determinants, if an extra
condition is imposed on $u$ so as to give a six-rowed determinant $D_{1345}$,
including $D_{134}$ as a first minor, we have


d 
dx Disa 


= (As — Az)(function)? < 0. 


For the various functions, the first zeros beyond x=0 lie in the following order
from left to right: $u_1$, $D$, $D_1$, $D_{14}$, $D_{134}$, $D_{1345}$.

For the general problem with conditions (2.1) the essential facts may be
formulated in a fashion similar to that of the special case selected. If the
second set is deleted, the determinant $D_{1\cdots m-1}$ has no zero within 0, 1
while $D_{1\cdots m-2}$ has one, $D_{1\cdots m-3}$ has two and $u_1$ has m-1. But the addition
of any group of one or more (and in any order) of the second set (which is
more or less supernumerary to the problem) gives a determinant with no zero
within the interval. The imposition of another condition moves further to
the right the zero of the determinant, and this continues step by step until
as many conditions are imposed as is desired. It is striking that determinants
of integrals of any desired order and with no zero in the interval 0, 1 can be
built up in this simple fashion. It is also noteworthy that the ratio of the
determinant or of any minor symmetrically placed with regard to the main
diagonal to any other minor contained in it and also symmetrically placed is
a function of x monotone in the interval, provided only that the latter
contains the term $u_1$.

It may further be remarked that the above discussions apply not only 
when there are two groups of linear conditions $K_i=0$, each with consecutive
subscripts, but also when these conditions are taken at random. The
minimum is furnished by $U_p$ where p is the smallest integer not included
among the i’s; the Jacobi condition admits of interpretation as in the 
case discussed. It is also immediately evident that a minimum would exist 
if i ran over some sequence not including all the integers but with infinity 
as a limit. The argument of this section paves the way for the extension 
of the theory to the infinite case. 


3. THE EXISTENCE OF EXTREMA 


In §1 a single sequence of functions $U_1, U_2, \cdots$ was defined in the
orthogonal case and a double sequence $U_1, U_2, \cdots;\ \cdots, U_{-2}, U_{-1}$ was defined in
the polar case as solutions of the differential equation


(3.1) $$L(u) = (pu')' + qu + \lambda ku = 0$$




under the boundary conditions 
(3.2) u(0) = u(1) = 0. 


The theorems of the present section concerning these functions fall into two 
groups according as the orthogonal (k(x) one sign) or the polar case (k(x) 
both signs) is considered. For the polar case it is possible (by the addition of 
an infinity of linear conditions imposed on the functions $U_{-i}$) to establish
results in nature similar to those of Theorem I; but the principles involved 
are sufficiently illustrated by the less complicated formulation here given. 
For the sake of simplicity in the polar case a further hypothesis is made 
that all the characteristic solutions are real; this will be the case, for example, 
if q(x) <0.

The relative extrema here discussed concern three integrals the relations 
of each of which to the differential equation (3.1) have been discussed in 
the Introduction. These are 


$$D(u) = \int_0^1 (pu'^2 - qu^2)\,dx,\quad p>0;\qquad K_0(u) = \int_0^1 ku^2\,dx,$$


R(u) = f f 


where $G(x,\xi)$ is the Green's function of the differential expression (pu')'+qu
with boundary conditions (3.2). In discussing the last integral we restrict
ourselves to the case q<0 in order that R(u) be positive. For each couple
of these three integrals it is possible to prove a pair of theorems concerning
extrema.

The integrals D(u), $K_0(u)$, R(u) can be approximated as closely as we
please by the corresponding integrals in which u(x) possesses an absolutely
continuous first derivative. Hence there is no loss of generality in restricting
ourselves to the consideration of such functions u.

Theorems I-III concern the orthogonal case and IV the polar case. 


THEOREM I. Among all continuous functions u(x) which give to the integral
D(u) a meaning and which are subject to the condition $K_0=1$, the boundary
conditions (3.2), and the infinity of linear conditions

(3.3) $$\int_0^1 kU_i\,u\,dx = 0 \qquad (i=1,\cdots,m-1;\ s+1,\ s+2,\cdots),$$

the maximum value $\lambda_s$ of D(u) is furnished by $U_s$ and the minimum value $\lambda_m$
is furnished by $U_m$.




For, in the orthogonal case formulas analogous to (1.20), (1.21) have the
simpler form

$$K_0(u) = \sum_{i=1}^{\infty} c_i^2,\qquad D(u) = \sum_{i=1}^{\infty}\lambda_i c_i^2$$

and the hypotheses (3.3) reduce the problem to the consideration of relative
extrema for quadratic forms in a finite number of variables only,

(3.4) $$D(u) = \sum_{i=m}^{s}\lambda_i c_i^2 = \text{extremum},\qquad \sum_{i=m}^{s} c_i^2 = 1.$$

Since $\lambda_s$ is the largest of the characteristic numbers here appearing, and $\lambda_m$
the smallest, the theorem is immediately established.
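The reduction to a finite quadratic form can be checked numerically. The following is only a sketch under an assumed model problem that does not appear in the text: p = 1, q = 0, k = 1 on the interval 0, 1, for which the characteristic numbers are $(i\pi)^2$. The names `char_number`, `D_of`, `random_unit_vector` and `extrema_check` are hypothetical helpers introduced for illustration.

```python
import math
import random

def char_number(i):
    # Assumed model problem (not from the paper): p = 1, q = 0, k = 1,
    # so (3.1) reads u'' + lambda*u = 0 with u(0) = u(1) = 0 and
    # lambda_i = (i*pi)^2.
    return (i * math.pi) ** 2

def D_of(c, m):
    # D(u) = sum_{i=m}^{s} lambda_i * c_i^2, the quadratic form in (3.4);
    # c[0] is c_m, c[1] is c_{m+1}, and so on.
    return sum(char_number(m + j) * cj * cj for j, cj in enumerate(c))

def random_unit_vector(n, rng):
    # Coefficients normalised so that sum c_i^2 = 1, i.e. K_0 = 1.
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def extrema_check(m, s, trials=2000, seed=0):
    # Sample admissible coefficient vectors and record the range of D.
    rng = random.Random(seed)
    lo, hi = char_number(m), char_number(s)
    values = [D_of(random_unit_vector(s - m + 1, rng), m) for _ in range(trials)]
    return lo, hi, min(values), max(values)

lo, hi, vmin, vmax = extrema_check(m=2, s=5)
# Every sampled value of D lies between lambda_m and lambda_s (up to rounding).
print(lo - 1e-9 <= vmin and vmax <= hi + 1e-9)
```

The bounds are attained only at the coordinate vectors, which correspond to $U_m$ and $U_s$; random sampling merely illustrates that no admissible combination leaves the interval between $\lambda_m$ and $\lambda_s$.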
A consideration of the proof of Theorem I indicates that the reciprocal 
theorem can be at once deduced. 


THEOREM Ia. Under the boundary and linear conditions of Theorem I and
$q(x)\le 0$ the minimum of $K_0(u)$ for those values of u which make D(u)=1 is $1/\lambda_s$
furnished by $U_s$ and the maximum is $1/\lambda_m$ furnished by $U_m$.


THEOREM II. Among all continuous functions u(x) the integral R(u),
under the conditions $q\le 0$, $K_0=1$ and (3.2), (3.3), possesses a maximum value
$1/\lambda_m$ furnished by $U_m$ and a minimum value $1/\lambda_s$ furnished by $U_s$.


For, on setting h(x) =k(x) u(x) the formula (1.19) becomes 


( Uda)dz) 


and from (1.20) and the hypotheses, this may be written

(3.5) $$R(u) = \sum_{i=m}^{s}\frac{c_i^2}{\lambda_i}.$$


This with the second formula of (3.4), valid here also, is sufficient to establish 
the theorem. 


THEOREM IIa. Under the boundary and linear conditions of Theorem II
the maximum of $K_0$ for those values of u which make R(u)=1 is $\lambda_s$ furnished by
$U_s$ and the minimum is $\lambda_m$ furnished by $U_m$.


A consideration of the preceding theorems and of (0.20) suggests another 
theorem which, with its reciprocal, may be readily proved by means of 
(3.4) and (3.5): 




THEOREM III. Among all functions u(x) which give D(u) a meaning and
are subject to the conditions that R(u)=1, q(x)<0, and (3.2), (3.3), the integral
D(u) possesses a minimum $\lambda_m^2$ furnished by $U_m$ and a maximum $\lambda_s^2$ furnished
by $U_s$.


THEOREM IIIa. Under the boundary and linear conditions of Theorem III
the maximum of the integral R(u) subject to the condition D(u)=1 is $1/\lambda_m^2$
furnished by $U_m$ and the minimum is $1/\lambda_s^2$ furnished by $U_s$.


In the polar case the situation allows only one extremum and the re- 
ciprocal theorem will have only one. 


THEOREM IV. Among all continuous functions u(x) which give D(u)
a meaning and are subject to the conditions $K_0=1$, (3.2) and (3.3), the integral
D(u) possesses a minimum $\lambda_m$ furnished by $U_m$, while the maximum is infinite.


For as in Theorem I, by means of (1.20) and (1.21) and the hypothesis,
the problem is reduced to relative extrema of quadratic forms with an
infinite number of variables

(3.6) $$D(u) = \sum_{i=m}^{s}\lambda_i c_i^2 - \sum_{i=-\infty}^{-1}\lambda_i c_i^2 = \text{extremum};\qquad K_0 = \sum_{i=m}^{s} c_i^2 - \sum_{i=-\infty}^{-1} c_i^2 = 1.$$

On multiplication of the second of these by $\lambda_m$ and subtraction from the first,
there results

$$D(u) - \lambda_m = \sum_{i=m+1}^{s} c_i^2(\lambda_i-\lambda_m) + \sum_{i=-\infty}^{-1} c_i^2(\lambda_m-\lambda_i)$$

and since all the coefficients of $c_i^2$ are positive, it is seen that the minimum
is given by $c_m=1$, $c_i=0$ ($i=\cdots,-2,-1;\ m+1,\cdots,s$). On the other
hand for $c_m=2^{1/2}$, $c_{-n}=1$, and the other c's zero the formulas (3.6) give
D(u) the value $2\lambda_m-\lambda_{-n}$, and this may be made as great as is desired by taking
n large enough.
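The unbounded maximum in the polar case can be illustrated with a small numerical sketch. It assumes that (3.6) reads $D(u)=\sum|\lambda_i|c_i^2$ and $K_0=\sum \pm c_i^2$ (plus for positive, minus for negative indices), and it uses a purely hypothetical symmetric spectrum $\lambda_{\pm i}=\pm(i\pi)^2$ that is not taken from the paper; the function names `lam` and `D_and_K0` are likewise illustrative.

```python
import math

def lam(i):
    # Hypothetical polar spectrum, for illustration only:
    # lambda_i = (i*pi)^2 for i > 0 and lambda_i = -(i*pi)^2 for i < 0.
    return math.copysign((i * math.pi) ** 2, i)

def D_and_K0(c):
    # c maps an index i (positive or negative) to the coefficient c_i.
    # Assumed form of (3.6): D = sum |lambda_i| c_i^2, K_0 = sum sign(i) c_i^2.
    D = sum(abs(lam(i)) * ci * ci for i, ci in c.items())
    K0 = sum(math.copysign(1.0, i) * ci * ci for i, ci in c.items())
    return D, K0

m = 2
# The minimising choice c_m = 1 gives D = lambda_m with K_0 = 1.
D_min, K_min = D_and_K0({m: 1.0})

# The choice c_m = sqrt(2), c_{-n} = 1 keeps K_0 = 1 while D grows with n.
growth = [D_and_K0({m: math.sqrt(2.0), -n: 1.0}) for n in (1, 5, 25)]
for D, K0 in growth:
    print(round(D, 3), round(K0, 12))
```

Each printed pair keeps $K_0$ at 1 while D equals $2\lambda_m-\lambda_{-n}$, which increases without bound as n grows.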


THEOREM IVa. Under the boundary and linear conditions of Theorem IV
and provided q<0, the maximum $1/\lambda_m$ of the integral $K_0$ for those values of u
which make D(u)=1 is furnished by $U_m$.


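For, on multiplication of the second of the expressions

$$K_0 = \sum_{i=m}^{s} c_i^2 - \sum_{i=-\infty}^{-1} c_i^2 = \text{max.},\qquad D(u) = \sum_{i=m}^{s}\lambda_i c_i^2 - \sum_{i=-\infty}^{-1}\lambda_i c_i^2 = 1$$

by $1/\lambda_m$ and subtraction from the first, it follows that

$$K_0 - \frac{1}{\lambda_m} = \sum_{i=m+1}^{s} c_i^2\Big(1-\frac{\lambda_i}{\lambda_m}\Big) - \sum_{i=-\infty}^{-1} c_i^2\Big(1-\frac{\lambda_i}{\lambda_m}\Big)$$

and since all the coefficients of $c_i^2$ are negative the theorem follows at once.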

If in Theorem IV, instead of setting $K_0$ equal to 1 it is equated to -1 and
if in (3.3) the $U_i$ are replaced by $U_{-i}$, the minimum is $-\lambda_{-m}$ furnished by $U_{-m}$.
There is a corresponding reciprocal theorem.


4. GENERALIZATION OF THE EXTREMUM PROBLEM. 
THE EULER EQUATION AND ITS SOLUTIONS 


If we generalize the problem (1.7), (1.8), (2.1) by seeking the minimum 
or maximum of 


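(4.1) $$D(u) = \int_0^1 (pu'^2 - qu^2)\,dx;\qquad p>0,\quad u(0)=u(1)=0,$$

under the quadratic condition

(4.2) $$K_0 = \int_0^1 ku^2\,dx = 1$$

and the infinite number of linear conditions

(4.3) $$K_i = \int_0^1 kU_i\,u\,dx = 0 \qquad (i=1,\cdots,m-1;\ s+1,\ s+2,\cdots)$$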

the Lagrange method suggests the consideration of the absolute minimum 
of the integral 


(4.4) $$\int_0^1 \Big[\, pu'^2 - qu^2 + \lambda (v_0' - ku^2) + \sum_{i=1}^{m-1} 2\mu_i (v_i' - kU_i u) + \sum_{i=s+1}^{\infty} 2\mu_i (v_i' - kU_i u) \Big]\,dx,$$


where after the analogy of (1.9) and (1.12) for the finite case, the v's are
defined as follows:

(4.5) $$v_0 = \int_0^x ku^2\,dx,\qquad v_i = \int_0^x kU_i\,u\,dx,$$

which may also be written

(4.6) $$v_0' - ku^2 = 0,\quad v_0(0)=0,\quad v_0(1)=1;\qquad v_i' - kU_i u = 0,\quad v_i(0)=v_i(1)=0.$$




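It is natural to expect that the Euler equation will have a form

(4.7) $$(pu')' + qu + \lambda ku + \sum_{i=1}^{m-1}\mu_i kU_i + \sum_{i=s+1}^{\infty}\mu_i kU_i = 0$$

generalized from (2.2) and the solutions

(4.8) $$u = a\,u_1(x,\lambda) - \sum_{i=1}^{m-1}\frac{\mu_i U_i}{\lambda-\lambda_i} - \sum_{i=s+1}^{\infty}\frac{\mu_i U_i}{\lambda-\lambda_i}$$

will be a generalized form of (2.3).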
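That a function of the form (4.8) satisfies an equation of the form (4.7) can be sketched as follows. The check assumes that $u_1(x,\lambda)$ satisfies $(pu_1')'+qu_1+\lambda ku_1=0$ and that each $U_i$ satisfies the same equation with $\lambda$ replaced by its characteristic number $\lambda_i$; the operator symbol $L_\lambda$ is introduced here only for the purpose of the check, and the sums run over $i=1,\cdots,m-1$ and $i=s+1,s+2,\cdots$.

```latex
\begin{align*}
L_\lambda(w) &:= (pw')' + qw + \lambda k w,\\
L_\lambda(u_1) &= 0,\\
L_\lambda(U_i) &= \big[(pU_i')' + qU_i + \lambda_i kU_i\big] + (\lambda-\lambda_i)\,kU_i
               = (\lambda-\lambda_i)\,kU_i,\\
L_\lambda\Big(a\,u_1 - \sum_i \frac{\mu_i U_i}{\lambda-\lambda_i}\Big)
  &= -\sum_i \frac{\mu_i}{\lambda-\lambda_i}\,(\lambda-\lambda_i)\,kU_i
   = -\sum_i \mu_i\,kU_i .
\end{align*}
```

Hence $L_\lambda(u)+\sum_i\mu_i kU_i=0$, which is the equation (4.7), provided $\lambda$ differs from every $\lambda_i$ occurring in the sums.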

That the solutions (4.8) actually satisfy the equation (4.7) may be proved 
by direct substitution. To indicate the line of argument for deriving the Euler 
equation (4.7), we proceed formally and assume that u(x) gives an extremum 
and set up admissible variations after the usual method. If the fundamental 
set of functions on which the variations are to be linearly dependent are 
chosen at random, the number of them must ordinarily be infinite. For,
the family

(4.9) $$Y(x,\epsilon) = u + \sum_i \epsilon_i\eta_i(x),\qquad \eta_i(0)=\eta_i(1)=0$$

is subject to a quadratic and an infinity of linear conditions and the $\epsilon$'s
must be chosen to satisfy them. If, however, it be noted that the linear
conditions (4.3) are satisfied by any one of the functions* $U_m,\cdots,U_s$,
or any linear combination of them, the problem is reduced to a much simpler
one. For example,

$$Y(x,\epsilon) = (1-\epsilon)\,U_m + (2\epsilon-\epsilon^2)^{1/2}\,U_l,\qquad m<l\le s,$$

satisfies not only the linear conditions but also the quadratic (4.2).
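The quadratic condition for this example can be verified in one line. The sketch assumes the one-parameter family reads $Y=(1-\epsilon)U_m+(2\epsilon-\epsilon^2)^{1/2}U_l$ and that the $U_i$ are normalized so that $\int_0^1 kU_iU_j\,dx=\delta_{ij}$ (the orthogonal case).

```latex
\begin{align*}
K_0(Y) &= \int_0^1 k\,\big[(1-\epsilon)U_m + (2\epsilon-\epsilon^2)^{1/2}U_l\big]^2 dx\\
       &= (1-\epsilon)^2 + (2\epsilon-\epsilon^2)
        = 1 - 2\epsilon + \epsilon^2 + 2\epsilon - \epsilon^2 = 1,\\
K_i(Y) &= \int_0^1 kU_i\,Y\,dx = 0 \qquad (i \ne m,\ i\ne l).
\end{align*}
```

The cross term drops out because $U_m$ and $U_l$ are orthogonal with respect to k, so the family stays on the surface $K_0=1$ for every $\epsilon$ with $0\le\epsilon\le 2$.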


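In the general case it is easily seen that the set (4.9) must satisfy the
relations

$$K_0 = \int_0^1 k\Big(u + \sum_i \epsilon_i\eta_i\Big)^2 dx = 1,\qquad K_j = \int_0^1 kU_j\Big(u + \sum_i \epsilon_i\eta_i\Big)dx = 0,$$

and give to

$$D = \int_0^1 \Big[\,p\Big(u' + \sum_i \epsilon_i\eta_i'\Big)^2 - q\Big(u + \sum_i \epsilon_i\eta_i(x)\Big)^2\Big]dx$$

an extremal value for $\epsilon_i=0$. It is then necessary that, for the values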


* In the polar case these functions are $\cdots, U_{-2}, U_{-1}, U_m, \cdots, U_s$. The argument of this section
is in general valid for that case also.


aD 
—d;=0, —de; =0, de; = 0 
a 


1 €j €j €j 


-,m—1, s+2,---); 
hence whatever the multipliers \, 4; may be, it follows that 
~ 
(4.10) + 0 
Oe; 2 0€; 


where 
1 


s+1 


Let A, u, be determined by the equations 
aM aD 0K, OK, 
0€; 1 0 
which is possible provided the determinant of the coefficients of \ and p; 


0€; €; 


1 1 1 1 
f kuynodx f kU ynodx f kU m—1n2dx f RU 
0 0 0 0 
1 1 1 1 
f kuyn3dx f kU yn3dx ees f kU m—1n3dx f RU 
0 0 0 0 


is different from zero. This may be ensured by proper choice of the $\eta$'s;
for example, diagonal terms may be made unity and all the other terms zero.

The values of $\lambda$, $\mu_i$ so chosen are independent of $\eta_1$. Hence from the
formula $\partial M/\partial\epsilon_1=0$ derived by subtracting the infinite set of equations (4.11)
from (4.10), the Euler equation (4.7) may be at once derived in the usual
way.

Let us return to a discussion of the solutions (4.8) which may be regarded
as an infinity-parameter set of plane extremals through the origin. Since
the $U_i$ vanish at x=1, in order that u vanish at that point also, it is necessary
that $u_1(1,\lambda)=0$. For the minimizing or maximizing extremal of the family
it may be shown that $\mu_i=0$ by the method used in deriving (1.15); in other
words the extremum is a solution of the homogeneous equation (1.1). The
function u is then a solution of the homogeneous system (1.1), (1.2) and is
orthogonal to $U_i$ unless it is a multiple of it. So far as we ascertain from
the Euler equation, any one of the functions $U_m,\cdots,U_s$ corresponding to
the characteristic numbers $\lambda_m,\cdots,\lambda_s$ might serve as a solution. One
of these must give the minimum and one of them the maximum.




To round out the discussion and prepare for the treatment of the Jacobi
condition, the problem of extremals may be interpreted in infinity-dimension
space $x\,u\,v_0\,v_i$. The Euler equations would in that case consist of (4.7),
with the boundary conditions u(0)=u(1)=0, together with (4.6); the
solutions constituting the infinity-parameter family of extremals through the
origin would have a form generalized from (2.4) and would consist of (4.5)
and (4.8).

In dealing with this family of extremals passing through the origin,
it is natural to consider only those functions for which $\int_0^1 ku^2\,dx$ is finite;
an application of this condition to (4.8) shows that this limitation is
equivalent to supposing that $\sum[\mu_i/(\lambda-\lambda_i)]^2$ is limited.


5. THE SECOND VARIATION 


Despite the introduction of newer methods for the simple problem 
without auxiliary conditions, the method of second variation still remains 
standard for isoperimetric problems. It is then natural after the discussion 
of the Euler equation to proceed to the discussion of $\delta^2 D$. For the admissible
variations $\eta=\sum\epsilon_i\eta_i$ set up in (4.9) it is a necessary condition that, according
as a minimum or maximum is sought, $\delta^2 D\ge 0$ or $\delta^2 D\le 0$. Since by the nature
of the hypotheses, the second variations $\delta^2 K_i$ are 0, this may also be written


m—1 
+ + = 0 or <0. 
1 


s+1 


On calculation from (4.4) it is found that 


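(5.1) $$\delta^2 D = \int_0^1 (p\eta'^2 - q\eta^2 - \lambda k\eta^2)\,dx$$

where $\lambda$ is the characteristic number of the extremum solution and where $\eta$
is subject to the conditions

(5.2) $$\int_0^1 k u\eta\,dx = 0,\qquad \int_0^1 kU_i\eta\,dx = 0 \quad (i=1,\cdots,m-1;\ s+1,\cdots).$$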


By integrating (5.1) by parts and adding multiples of the linear terms (5.2) 
the second variation may be written 


m—1 


1 
(5.3) &D= — ef n{(pn’)’ + an + Aken + + 
1 


0 


|dx. 


8+1 


In the orthogonal case* for the problem (4.1), (4.2), (4.3) the second variation
related to the minimizing function $U_m$ is positive and that related to the
maximizing function $U_s$ is negative.

For, taking up first the problem of a maximum it is necessary that the 
integral 


(5.4) $$\int_0^1 (p\eta'^2 - q\eta^2 - \lambda_s k\eta^2)\,dx$$

be negative for all $\eta\not\equiv 0$ satisfying the continuity and boundary conditions
and the linear conditions (5.2); it will also satisfy a quadratic condition such 
as 


1 
(5.5) ff =o 0. 
0 


The problem may be regarded as that of finding a maximum zero of (5.4) for
those functions $\eta(x)\not\equiv 0$ which satisfy (5.2) and (5.5). As Bliss has pointed
out in similar problems, the original problem for the integral D(u) may
itself be put into precisely this form and the results there obtained applied
here. The admissible variation $\eta$ must be linearly dependent on $U_m,\cdots,U_s$;
that is, $\eta=\sum_{i=m}^{s} a_i U_i$; on calculation it turns out in a manner analogous
to Theorem I of §3 that

$$\delta^2 D = \sum_{i=m}^{s}(\lambda_i-\lambda_s)\,a_i^2 \le 0.$$

That $\delta^2 D$ is actually negative may be seen by noting that it could be zero
only if $a_m=\cdots=a_{s-1}=0$; since $\sum a_i^2\neq 0$, it follows that $a_s\neq 0$. But $U_s$
cannot be an admissible variation for $U_s$ itself since the value of $K_0$ would be
affected; hence $\delta^2 D<0$.
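The calculation referred to can be sketched as follows, assuming the orthogonal normalization $\int_0^1 kU_iU_j\,dx=\delta_{ij}$, the equation $(pU_i')'+qU_i+\lambda_i kU_i=0$ with $U_i(0)=U_i(1)=0$, and the second variation in the form $\int_0^1(p\eta'^2-q\eta^2-\lambda_s k\eta^2)\,dx$ for the maximum problem.

```latex
\begin{align*}
\int_0^1 (pU_i'U_j' - qU_iU_j)\,dx
  &= -\int_0^1 U_j\big[(pU_i')' + qU_i\big]dx
   = \lambda_i\int_0^1 kU_iU_j\,dx = \lambda_i\,\delta_{ij},\\
\eta = \sum_{i=m}^{s} a_iU_i \;\Longrightarrow\;
\int_0^1 (p\eta'^2 - q\eta^2)\,dx &= \sum_{i=m}^{s}\lambda_i a_i^2,
\qquad \int_0^1 k\eta^2\,dx = \sum_{i=m}^{s} a_i^2,\\
\int_0^1 (p\eta'^2 - q\eta^2 - \lambda_s k\eta^2)\,dx
  &= \sum_{i=m}^{s}(\lambda_i-\lambda_s)\,a_i^2 \;\le\; 0 .
\end{align*}
```

The integrated term from the integration by parts vanishes because of the boundary conditions, and every coefficient $\lambda_i-\lambda_s$ with $i<s$ is negative, which gives the stated sign.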

A similar argument shows that in the problem of a minimum the
admissible variations of the function $U_m$ make $\delta^2 D$ positive.

From analogy with the Legendre condition for the finite problem, we 
would expect, in order that a maximum exist, that $H_{u'u'}=2p$ (where H is
the integrand of (4.4)) must be negative while for a minimum this same 
function must be positive. But here we have found a maximum for p>0 
in striking contradiction to the theorems for the finite problem. It is evident 
that there must be some underlying reason why one of these conditions 
and not the other is satisfied. As will be evident later, an investigation of 


* For the minimum problem in the polar case (Theorem 4, § 3), it follows in similar fashion that 


8 
m 
m 




the Jacobi condition for the problem is fundamental before any appeal 
can be made to the Legendre condition. 

It may be noted that all the admissible variations of the maximum prob-
lem are contained in the family of extremals of the minimum problem, while 
a part only of the admissible variations for the minimum problem are 
contained among the extremals of the maximum problem. 


6. THE JACOBI CONDITION FOR THE INFINITE PROBLEM


In (1.3) there was set up a two-parameter family $a\,u_1(x,\lambda)$ of solutions
of the homogeneous equation (1.1) and in (4.8) an infinity-parameter family

(6.1) $$u = a\,u_1(x,\lambda) - \sum_{i=1}^{m-1}\frac{\mu_i U_i}{\lambda-\lambda_i} - \sum_{i=s+1}^{\infty}\frac{\mu_i U_i}{\lambda-\lambda_i}$$

of plane extremals which are solutions of the Euler equation (4.7) and which
pass through the origin, the parameters $\mu_i$ being restricted to those values
which make $\sum[\mu_i/(\lambda-\lambda_i)]^2$ finite. By means of the auxiliary variables
$v_0(x), v_1(x), \cdots, v_{m-1}(x), \cdots$ as defined in (4.6), extremals were also
set up in space of infinity dimensions $x\,u\,v_0\,v_i$. To every extremal (6.1) of
the xu space corresponds an extremal in the higher space. Among the
questions which present themselves is that concerning the existence of a
field in the neighborhood of the minimizing extremal in infinity dimensions.
Does there exist a region about this curve through each point of which there
passes a unique extremal of the family in infinity-dimensional space? In
other words, do there exist constants $a$, $\lambda$, $\mu_i$ such that for these values an
extremal (6.1), (4.5) passes through the origin and any other designated point?
Is there a one-to-one correspondence between the $x\,u\,v_0\,v_i$ space and the
$a\,\lambda\,\mu_i$ space? Or, on the contrary, is one extremal cut by a neighboring one
before the end of the interval 0, 1 is reached: that is, is the point conjugate 
to x =0 in the extended sense within the interval? The condition for a con- 
jugate point has been developed in §2 at considerable length for the finite 
problem and it is not necessary in extending it formally to the infinite prob- 
lem that great detail be given. 

For a conjugate point an infinity of conditions corresponding to (2.5) 
must be satisfied: 

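$$a\,\frac{\partial u_1}{\partial\lambda}\,\delta\lambda + \delta a\,u_1 - \sum_i \frac{U_i}{\lambda-\lambda_i}\,\delta\mu_i = 0,$$

(6.2) $$2a^2\,\delta\lambda\int_0^x k u_1\frac{\partial u_1}{\partial\lambda}\,dx + 2a\,\delta a\int_0^x k u_1^2\,dx - 2a\sum_i\frac{\delta\mu_i}{\lambda-\lambda_i}\int_0^x k u_1 U_i\,dx = 0,$$

$$\cdots\cdots\cdots$$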




This leads to a consideration of the infinite determinant: 
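(6.3) $$\begin{vmatrix}
\dfrac{\partial u_1}{\partial\lambda} & u_1 & \cdots & U_{m-1} & U_{s+1} & \cdots\\[2mm]
\int_0^x k u_1\dfrac{\partial u_1}{\partial\lambda}\,dx & \int_0^x k u_1^2\,dx & \cdots & \int_0^x kU_{m-1}u_1\,dx & \int_0^x kU_{s+1}u_1\,dx & \cdots\\[2mm]
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots
\end{vmatrix}$$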


Denoting by $F_{m+1}(x), F_{m+2}(x), \cdots$ the principal (m+1)th, (m+2)th,
$\cdots$ order minors in its upper left-hand corner, it may easily be shown that
the determinant (6.3), regarded as the limit of these minors, is a bounded
function of x, being 0 at x=0 and 1 at x=1. From the previous paper*
we have the theorems that each of the functions $F_{m+1}(x), F_{m+2}(x), \cdots$ are
0 at x=0 and positive elsewhere, being 1 at x=1, and that the quotient
$F_{m+p}/F_m$ is, for all p, a monotone function ranging from 0 at x=0 to 1 at
x=1. A passage to the limit gives a bounded function (6.3).

The formal analogon of the Jacobi condition may then be stated as 
follows: In order that there be no conjugate point in the interval, the infinite 
determinant (6.3) must have no zero other than x =0 in the interval. 

To give a formal indication of the necessity of this condition let us assume 
that the equations (6.2) are satisfied for a point $x_1$ within the interval and
prove that this involves the vanishing of $\delta^2 D$. We have seen (5.3) that the
second variation may be written 


1 
(6.4) ef + qn + Amkn + addku + 
0 


Since for the minimizing extremal $\lambda=\lambda_m$ and $\mu_i=0$, it follows that $\mu_i=\delta\mu_i$,
$\lambda-\lambda_m=\delta\lambda$; the expression in brackets in the integrand of (6.4) has then
the form


(6.5) (pn’)’ + gn + Amkn + + 


It is readily shown that the substitution of the expression on the left of the 
first equation of (6.2) will make (6.5) zero. If then in the interval 0, $x_1$ we


* Loc. cit., Mathematische Annalen, vol. 68, p. 289. 




choose for the variation $\eta$ this expression in (6.2) and in the sub-interval
$x_1$, 1 set $\eta=0$, the second variation vanishes.

That $\delta^2 D$ can actually in that case be made negative can be shown by
the following argument. Referring to the discussion of the second variation in
§5 it may be noted that were $\eta$ to furnish the minimum for the integral (5.4)
under the conditions (5.2), (5.5), thus making the Euler equation of this
subsidiary problem the same as that of the original extremum problem,
the solution would have all its derivatives continuous at $x_1$, which is obviously
not the case here. Hence the variation $\eta$ chosen above does not give a
minimum to the second variation and $\delta^2 D$ can be made negative. This would
indicate that for a minimum the point conjugate to x=0 cannot be within
the interval; and it indicates also that there must be a conjugate point in
the interval if there is to be a maximum.

To consider the relation between the infinite determinants (6.3) for var- 
ious values of m and s let us denote by D,, the infinite determinant obtained 
by setting m=1 and s=3 and by D,y, that obtained by setting m=1, s=2. 
The latter contains one more row and column than the former and as is 
indicated in the Lemma in §2, we have the formula 


d Du — As 2 


= a14, 


dx Dy34 pD2 


134 


where a, is a certain first minor of D,3;. In other words the discussion parallels 
exactly that of §2 except that instead of a finite number of terms there is 
an infinite number. Each of the infinite determinants obtained by dropping 
out any finite number of columns and the corresponding rows (taken in 
order or scattered here and there throughout the determinant) can have no 
zero within the interval. By dropping out any column and corresponding 
row the zero of the determinant moves to the left. Since the ratio of any 
determinant to that of order lower by one is monotone in the interval 0, 1 
the same will be true concerning the ratio of any two in the scale provided 
the one is contained in the other. 


7. HAMILTON FUNCTION. HILBERT INTEGRAL. WEIERSTRASS CONDITION 


Assuming that the Jacobi condition is satisfied in the interval 0, 1, 
consider a point in the infinity-dimensional field about the maximizing or 
minimizing extremal. Through the origin and this point whose abscissa is 
x there will be an extremal of the family 


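(7.1) $$u = a\,u_1(x,\lambda) - \sum_{i=1}^{m-1}\frac{\mu_i U_i}{\lambda-\lambda_i} - \sum_{i=s+1}^{\infty}\frac{\mu_i U_i}{\lambda-\lambda_i},$$

(7.2) $$v_0 = \int_0^x ku^2\,dx,\qquad v_i = \int_0^x kU_i\,u\,dx.$$

The Hamilton function is defined to be the integral

(7.3) $$W(x,u,v_0,v_1,\cdots,v_{m-1},\cdots) = \int_0^x (pu'^2 - qu^2)\,dx$$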

taken along this extremal. Since along this curve the relations (7.2) are satis- 
fied, the integral may also be written 


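$$W = \int_0^x \Big[\, pu'^2 - qu^2 + \lambda (v_0' - ku^2) + \sum_{i=1}^{m-1} 2\mu_i (v_i' - kU_i u) + \sum_{i=s+1}^{\infty} 2\mu_i (v_i' - kU_i u) \Big]\,dx.$$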
By the method usual in such cases* the derivatives may be calculated 
formally as follows: 
ow 
— = pe? — qu? — o(2p9) — — 
1 


Ox s+1 


ow 

— = 2p; 

Ou 
where $\varphi$ is the slope of the projection on the xu plane of the space extremal
through the given point and $\varphi_i$ the slope of the projection of the space
extremal on the $xv_i$ plane. Because of the linear character of the conditions,
it follows from the definitions that both $\varphi_i$ and $v_i'$ are equal to the value of
$kuU_i$ at the point in question and hence are equal to one another. The
differential dW may then be written


m—1 bes 
dW = | - pe? — qu?— — + 2pedu 
1 
m—1 
+ = — (pe? + qu®)dx + 2pedu. 


1 a+1 


Because this is a perfect differential its integral 


(7.4) $$\int_0^x (-p\varphi^2 - qu^2 + 2p\varphi u')\,dx$$


is independent of the path and is the analogon of the Hilbert independent 


integral for this problem. 


* Bolza, p. 599. 




To set up the Weierstrass formula let us compare the value of D(u) 
taken for the interval 0, 1 along a curve C of admissible variation, which 
must satisfy the equations (7.2), with its value along the minimizing extremal. 
The integral (7.4) taken throughout the interval along the minimizing ex- 
tremal is the Hamilton function and represents the minimum. Its value 
along the admissible variation is the same. Hence 


$$\Delta J = \int_C \big[(pu'^2 - qu^2) + (p\varphi^2 + qu^2 - 2p\varphi u')\big]\,dx$$

and on setting $E(x,u,u',\varphi) = p(u'-\varphi)^2$ this may be written

$$\Delta J = \int_C E(x,u,u',\varphi)\,dx.$$

The conditions $E(x,u,u',\varphi)\ge 0$, $E(x,u,u',\varphi)\le 0$ would be the analogons
of the Weierstrass conditions for minimum and maximum respectively in the
finite problem.

Here $E(x,u,u',\varphi)\ge 0$ for both minimum and maximum, and the
significance of this condition has entirely disappeared.
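The reduction of the integrand to the function E is a one-line computation; the sketch below assumes only $p>0$ and writes $\varphi$ for the slope function of the field.

```latex
\begin{align*}
(pu'^2 - qu^2) + (p\varphi^2 + qu^2 - 2p\varphi u')
  &= p\,(u'^2 - 2\varphi u' + \varphi^2)\\
  &= p\,(u'-\varphi)^2 \;=\; E(x,u,u',\varphi) \;\ge\; 0 .
\end{align*}
```

The terms in $qu^2$ cancel, and since p is positive the sign of E is the same whether a minimum or a maximum is sought, which is why the condition carries no information here.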


BROWN UNIVERSITY,
PROVIDENCE, R. I. 


A CONTRIBUTION TO THE THEORY OF 
FUNDAMENTAL TRANSFORMATIONS 
OF SURFACES* 


BY 
M. M. SLOTNICK 


INTRODUCTION 


Two surfaces are said to be related to one another by a fundamental 
transformation, that is, by a transformation F, if the developables of the 
congruence of lines joining corresponding points on the surfaces cut the sur- 
faces in conjugate nets of curves. It is assumed that neither of these nets is 
a focal net of the congruence. The nets on the surfaces are also said to 
correspond by the transformation F. 

Although many well known transformations of surfaces are special
types of transformations F, the general case was treated in detail but
recently, by Eisenhart† and Jonas.‡ In a recent paper Graustein§ introduced
into the study of these transformations a projective invariant which was the
generalization of the invariant of a parallel map.‖ Certain important
theorems concerning this invariant were obtained whose nature indicates 
that transformations F can be investigated to advantage by means of it. 
We call this invariant the invariant C. 

When studied in terms of tangential coördinates, transformations F
present a complete duality among the elements involved. In this way, 
a second invariant, the invariant H, is obtained which is dual to the in- 
variant C. The invariant C is equal to the cross ratio in which a pair of 
corresponding points of the surfaces in the relation F is divided by the 
focal points of the line joining them. Dually, the invariant H is equal to the 
cross ratio in which a pair of corresponding tangent planes to the two 
surfaces is divided by the focal planes through their line of intersection. 


* Presented to the Society, October 29, 1927; received by the editors June 11, 1927. 

† Cf. Eisenhart's treatise, Transformations of Surfaces, Princeton, 1923, which deals primarily
with these transformations. We shall follow the notation employed in this book, and shall refer to
it as Eisenhart, T. S.

‡ Jonas, Sitzungsberichte, Berliner Mathematische Gesellschaft, vol. 14 (1915), pp. 103 ff.

§ W. C. Graustein, An invariant of a general transformation of surfaces, Bulletin of the American
Mathematical Society, vol. 32 (1926), pp. 357 ff.

‖ W. C. Graustein, Parallel maps of surfaces, these Transactions, vol. 23 (1922), pp. 298-332.




It is the purpose of this paper to make a study of transformations F 
based upon these invariants. In fact, the invariants C and H form a tool 
by means of which many theorems are found which do not easily lend
themselves to proof by the classical methods. Fundamental existence ques- 
tions which arise concerning the conditions on the invariants C and H and 
on the nets in the transformation F are readily answered. The relations 
between the invariants C and H and the surfaces in the transformation also 
yield interesting consequences. 

The invariant C is introduced in Part I, which also contains a fundamental 
theorem for transformations F of a given net having a given invariant C. 
The analogous work for the invariant H is done in Part II. The invariants 
C and H of a transformation F which is the product of two such transforma- 
tions are also discussed in these two parts. 

Transformations F and nets of special type are discussed in Part III.
The last part, Part IV, is devoted to the application of some of the results 
obtained to transformations of Ribaucour. 


I. THE INVARIANT C OF A TRANSFORMATION F 


1. Fundamental equations. A congruence of lines G and a net N are
said to be conjugate to one another if the curves of N, which is assumed not
a focal net of G, lie on the developables of G. Two nets N and N₁ are then
related to one another by a transformation F if the congruence G of lines 
joining corresponding points of these nets is conjugate to both nets. The 
congruence G is known as the conjugate congruence of the transformation F. 

Consider a surface S: x = x(u, v)* on which the parametric curves form
a net N, which has for its point equation†

\[
\frac{\partial^2\theta}{\partial u\,\partial v}
= \frac{\partial\log a}{\partial v}\,\frac{\partial\theta}{\partial u}
+ \frac{\partial\log b}{\partial u}\,\frac{\partial\theta}{\partial v}.
\tag{1.1}
\]

* I.e., xⁱ = xⁱ(u, v), i = 1, 2, …, n.
† Eisenhart, T. S., § 2.

To obtain an F transform of N we have first to find a congruence G conjugate
to N, and then a net N₁ conjugate to G. A net N′, parallel to N, and traced
by the point x′, where

\[
\frac{\partial x'}{\partial u} = h\,\frac{\partial x}{\partial u},
\qquad
\frac{\partial x'}{\partial v} = l\,\frac{\partial x}{\partial v},
\tag{1.2}
\]

determines G in that the point coördinates of N′ serve as direction parameters
for the lines of G. A solution θ of the point equation (1.1) of N will determine
a net N₁, conjugate to G, whose point coördinates are


\[
x_1 = x - \frac{\theta}{\theta'}\,x'.
\tag{1.3}
\]


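Here θ′ is a solution of the point equation of N′ corresponding to θ; i.e., it
satisfies the equations

\[
\frac{\partial\theta'}{\partial u} = h\,\frac{\partial\theta}{\partial u},
\qquad
\frac{\partial\theta'}{\partial v} = l\,\frac{\partial\theta}{\partial v}.
\tag{1.4}
\]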


The net N₁ is said to be an F transform of N by means of the solution θ of its
point equation and along the congruence G.

The lines of intersection of corresponding tangent planes to the surfaces
of N and N₁ also generate a congruence called the harmonic congruence of
the transformation. For a line L of this congruence the focal points F₁, F₂
have the coördinates*

\[
F_1:\ x - \frac{\theta}{\partial\theta/\partial u}\,\frac{\partial x}{\partial u},
\qquad
F_2:\ x - \frac{\theta}{\partial\theta/\partial v}\,\frac{\partial x}{\partial v},
\tag{1.5}
\]


and hence are the intersections of L with the focal planes of the corresponding 
line of G. 

2. The invariant C.† A transformation F of the net N into the net N₁
establishes a projective correspondence between the pencils of the tangent
lines to the surfaces of these nets at corresponding points x and x₁. These
pencils of tangent lines meet the line of intersection L of their planes (the
tangent planes to the surfaces of N and N₁ at x and x₁ respectively) in
projective ranges of points. In this projectivity the fixed points are the focal
points F₁ and F₂. If D and D₁ are a pair of corresponding points of the two
ranges on L, the invariant of the projectivity is

\[
C = (DD_1, F_1F_2).
\tag{2.1}
\]


The function C is a projective invariant of the transformation F which we 
shall call the conjugate invariant, or, briefly, the invariant C. 
The invariant C has another geometric significance.‡ It is the cross ratio
in which the points x and x₁ of the nets N and N₁ are divided by the focal
points z and y of the line of G; i.e.,

\[
C = (xx_1, zy).
\tag{2.2}
\]

* Eisenhart, T. S., § 17.
† W. C. Graustein, An invariant of a general transformation of surfaces, § 5.
‡ Ibid., § 3.


For the transformation F discussed in §1, the invariant C is found to be 


\[
C = \frac{t}{s},{}^{*}
\tag{2.3}
\]
where
\[
t = h\theta - \theta', \qquad s = l\theta - \theta'.
\tag{2.4}
\]


Two nets are said to be radial transforms† of one another when the lines
joining corresponding points are concurrent. We agree to admit radial 
transformations into the category of transformations F, and point out that 
C =1 is characteristic of them. 

Finally, we note that the invariant C of the inverse of a transformation F 
is equal to the reciprocal of that of the original transformation. 
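
The reciprocal property can be read off from the cross ratio (2.2). The sketch below assumes the usual convention for the cross ratio of four collinear points and introduces an auxiliary function f that is not part of the original notation; a convention that differs by a fixed reordering of the four points leads to the same conclusion.

```latex
% Assumed convention for collinear points a, b, c, d (signed abscissas on the line):
%   (ab, cd) = \frac{(a-c)(b-d)}{(a-d)(b-c)} .
% Hypothetical helper: f(p) = (p - z)/(p - y), with z, y the focal points on the line of G.
\[
C = (xx_1, zy) = \frac{f(x)}{f(x_1)},
\qquad
(x_1x, zy) = \frac{f(x_1)}{f(x)} = \frac{1}{C}.
\]
```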

3. Fundamental theorem. Equations (2.3), (2.4), (1.1), and (1.2), 
combined with the condition of compatibility of the equations (1.2), yield 
the relations 


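\[
\frac{\partial\log t}{\partial v}
= \left(1 - \frac{1}{C}\right)\frac{\partial}{\partial v}\log\frac{\theta}{a},
\qquad
\frac{\partial\log s}{\partial u}
= (1 - C)\,\frac{\partial}{\partial u}\log\frac{\theta}{b}.
\tag{3.1}
\]

We now form the difference between the derivative of the first of these equa-
tions with respect to u and that of the second with respect to v and obtain
the equation

\[
\frac{\partial^2\log C}{\partial u\,\partial v}
+ \frac{\partial}{\partial u}\left[\left(\frac{1}{C} - 1\right)\frac{\partial}{\partial v}\log\frac{\theta}{a}\right]
+ \frac{\partial}{\partial v}\left[(1 - C)\,\frac{\partial}{\partial u}\log\frac{\theta}{b}\right] = 0
\tag{3.2}
\]

as a condition on the invariant C and the solution θ of the point equation of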
the net N, for the transformation F. 
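
The relations (3.1) can be recovered in a few lines. The following is a sketch only: it assumes that ∂x/∂u and ∂x/∂v are linearly independent, takes t = hθ − θ′ and s = lθ − θ′, and uses subscripts for partial derivatives, a shorthand that the text itself does not employ.

```latex
% Compatibility of (1.2): (h x_u)_v = (l x_v)_u, with x_{uv} replaced by means of (1.1).
\begin{align*}
h_v x_u + h x_{uv} = l_u x_v + l x_{uv}
&\;\Longrightarrow\;
h_v = (l-h)\,(\log a)_v, \qquad l_u = (h-l)\,(\log b)_u .\\
% Differentiate t = h\theta - \theta' and s = l\theta - \theta', using (1.4).
t_v = h_v\theta + (h-l)\,\theta_v
&= (h-l)\,\theta\,\Bigl(\log\frac{\theta}{a}\Bigr)_v
 = (t-s)\,\Bigl(\log\frac{\theta}{a}\Bigr)_v ,\\
s_u = l_u\theta + (l-h)\,\theta_u
&= (l-h)\,\theta\,\Bigl(\log\frac{\theta}{b}\Bigr)_u
 = (s-t)\,\Bigl(\log\frac{\theta}{b}\Bigr)_u .
\end{align*}
% Dividing the last two lines by t and by s respectively, with C = t/s, gives (3.1).
```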

* Ibid., § 5.
† Eisenhart, T. S., § 14.

Suppose now that we have a net N with (1.1) as its point equation, of
which θ is a given solution. Given also a function C(u, v) satisfying (3.2).
The system of equations (3.1) combined with sC = t is then compatible,
and by means of it two functions t and s are defined to within the same multi-
plicative constant. The function

\[
\phi = \frac{t - s}{\theta}
\tag{3.3}
\]

is defined also to within this same multiplicative constant, and is found to
satisfy the adjoint equation of (1.1), namely

\[
\frac{\partial^2\phi}{\partial u\,\partial v}
+ \frac{\partial\log a}{\partial v}\,\frac{\partial\phi}{\partial u}
+ \frac{\partial\log b}{\partial u}\,\frac{\partial\phi}{\partial v}
+ \frac{\partial^2\log ab}{\partial u\,\partial v}\,\phi = 0.
\tag{3.4}
\]


Consequently the relation

\[
h - l = \phi
\tag{3.5}
\]

will determine two functions h and l to within a common additive constant
of integration, which serve to define a net N′:(x′) parallel to N, by means of
equations similar to (1.2).* The function θ′ = hθ − t = lθ − s is then a solution
of the point equation of N′ corresponding to θ.

Thus we have found transformations F of N by means of the given
θ having the given function C(u, v) as their common invariant C. The
nets N₁ determined in this manner as F transforms of N are given by the
equation

\[
x_1 = x - \frac{\theta}{\theta' + n\theta}\,(x' + nx),{}^{\dagger}
\tag{3.6}
\]

in which n is an arbitrary constant.


FUNDAMENTAL THEOREM I. A solution θ of the point equation (1.1) of a
net N, and a function C(u, v) which satisfies (3.2) determine ∞¹ nets N₁ which
are F transforms of N by means of θ having as their invariant C the given
function C(u, v). Any two of the nets N₁ are radial transforms of one another.


The last part of the theorem can be proved directly from (3.6); but it
will be made evident by the corollary of §6.

* Eisenhart, T. S., § 4, (18), (19), (20) and also the next theorem stated there.
† This result is obtained by availing ourselves of a translation of the coördinate axes.

4. Conjugate triads. If N₁ and N₂ are F transforms of the net N by
essentially different solutions, θ₁ and θ₂ (θ₁ ≠ cθ₂), of its point equation (1.1),
but along the same conjugate congruence, they are themselves in relation
F. The transformation F carrying N into Nᵢ (i = 1, 2) we indicate by Fᵢ,
and that carrying N₁ into N₂ by F₃; then

\[
F_1F_3 = F_2.
\]


Three nets so related to one another will be referred to as a conjugate triad 
of nets. 

From (1.5) it is noted that the transformations F₁ and F₂, and therefore
also F₃, have different harmonic congruences, corresponding lines of which
are concurrent.

If we indicate the invariant C of Fᵢ by Cᵢ (i = 1, 2, 3), we have from (2.2)

\[
C_1 = (xx_1, zy), \qquad C_2 = (xx_2, zy), \qquad C_3 = (x_1x_2, zy).
\tag{4.1}
\]

Hence

\[
C_1C_3 = C_2.
\tag{4.2}
\]
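
The product rule follows at once when the three cross ratios are written against the same pair of focal points. A minimal sketch, assuming the convention (ab, cd) = (a − c)(b − d)/((a − d)(b − c)) and the auxiliary function f, which is introduced here only for illustration:

```latex
% Hypothetical helper: f(p) = (p - z)/(p - y), so that (ab, zy) = f(a)/f(b)
% for any two points a, b on the common line of the conjugate congruence.
\[
C_1C_3
= \frac{f(x)}{f(x_1)}\cdot\frac{f(x_1)}{f(x_2)}
= \frac{f(x)}{f(x_2)}
= C_2 .
\]
```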


5. Harmonic triads. Suppose now that N₁ and N₂ are F transforms
of N by means of the same solution θ of its point equation (1.1), but along
different conjugate congruences. The nets N₁ and N₂ will be, in this case
also, F transforms of one another.* Three nets related to one another in
this manner will be referred to as a harmonic triad of nets. Using the same
symbolism as in the preceding section we may again write

\[
F_1F_3 = F_2.
\]

Any two of the three nets in a harmonic triad are obtained as F transforms
of the third by means of the same solution of its point equation. Because
of this fact, we see from (1.5) that the three transformations F involved have
the same harmonic congruence. It is to be noted also that the transformations
F in a harmonic triad of nets have different conjugate congruences, corre-
sponding lines of which are coplanar.

Using the definition of C as embodied in (2.1), we conclude that here, too,

\[
C_1C_3 = C_2.
\]


* Eisenhart, T. S., § 20. Eisenhart has applied the term triad to what we call a harmonic
triad, and has given no name to our conjugate triad. These terms have been introduced in the light
of the duality existing between the two types of triads.

6. Product of two transformations F. Suppose that the net Nᵢ is
transformed into the net Nⱼ by the transformation Fₖ (i, j, k = 1, 2, 3
cyclically), and let Lₖ be a line of the harmonic congruence of Fₖ. Since
Lₖ is the intersection of the tangent planes to Nᵢ and Nⱼ, the three lines L
must be either concurrent, or all three coincident. If the lines L are concur-
rent, the triangle formed by the focal points at the intersections of the
tangents to the u-curves with one another (cf. (1.5)) will be in the relation
of Desargues with that formed by the other three focal points. Each of the
sides of one of these triangles intersects the corresponding side of the other
triangle in a point of one of the three nets. Hence, corresponding points of
the three nets are collinear and the three nets form a conjugate triad.

If corresponding lines L are coincident, the three transformations
F have the same harmonic congruence; that is, the three nets form a harmonic 
triad.* 

This result combined with those of §§4, 5, yields 


THEOREM II. If the product of two transformations F is a transformation F,
the three nets in question form either a conjugate or a harmonic triad; and, in 
either case, the invariant C of the product transformation is equal to the product 
of those of the two given transformations. 


As an immediate consequence, we conclude 


COROLLARY. If two transformations F of a net by the same solution of its
point equation have equal invariants C, the two F transforms are radial trans- 
forms of one another; and conversely, if two non-radial F transforms of a net N 
along different conjugate congruences are radial transforms of one another, 
the two transformations F are by means of the same solution of the point equation 
of N and have equal invariants C. 


* There is also the case in which the three nets N form both a conjugate and a harmonic triad.
Two nets which, with the net N of §1, form such a configuration are those along the same congruence
conjugate to N and by means of the same solution θ of the point equation of N. However, the two
"corresponding" solutions of the point equation of N′ differ by a constant (θ′ and θ′ + constant;
cf. (1.4)). Three nets so related may be considered as forming either type of triad.

7. Transformations F in homogeneous point coördinates.† The point
equation of a net N on a surface S: x = x(u, v) in a space referred to a homo-
geneous point coördinate system is of the form

\[
\frac{\partial^2\theta}{\partial u\,\partial v}
= \frac{\partial\log a}{\partial v}\,\frac{\partial\theta}{\partial u}
+ \frac{\partial\log b}{\partial u}\,\frac{\partial\theta}{\partial v}
+ c\,\theta.
\tag{7.1}
\]

If θ is a solution of this equation, the point x₁ defined by

\[
\frac{\partial x_1}{\partial u} = t\,\frac{\partial}{\partial u}\left(\frac{x}{\theta}\right),
\qquad
\frac{\partial x_1}{\partial v} = s\,\frac{\partial}{\partial v}\left(\frac{x}{\theta}\right)
\tag{7.2}
\]

traces a net N₁ which is an F transform of N. The points with the coördinates
∂x₁/∂u and ∂x₁/∂v are the focal points F₁ and F₂ (cf. (1.5)) of the line of the
harmonic congruence. The invariant C of the transformation F is found to be

\[
C = \frac{t}{s}.
\tag{7.3}
\]

† Eisenhart, T. S., §§ 30, 37, 38.


The condition of compatibility of (7.2) assumes the form (3.1) by virtue
of the fact that θ is a solution of (7.1).

For a radial transformation C = 1; i.e., t = s. In this case, because of
(3.1), both t and s are equal to the same constant. Equations (7.2) can then
be integrated and

\[
x_1 = \frac{x}{\theta} + p
\tag{7.4}
\]

is obtained as the equation of a radial transformation. Here p represents the
coördinates of the center of the transformation.
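
The integration step can be spelled out as follows. This is a sketch that assumes (3.1) in the form ∂ log t/∂v = (1 − 1/C) ∂ log(θ/a)/∂v, ∂ log s/∂u = (1 − C) ∂ log(θ/b)/∂u, and (7.2) in the form ∂x₁/∂u = t ∂(x/θ)/∂u, ∂x₁/∂v = s ∂(x/θ)/∂v; the constant k is a name introduced here.

```latex
% With C = 1 the right-hand members of (3.1) vanish:
\[
\frac{\partial\log t}{\partial v} = 0, \qquad \frac{\partial\log s}{\partial u} = 0 ,
\]
% so t depends on u alone and s on v alone; since t = s, both equal one constant k.
% Equations (7.2) then state that x_1 and k x/\theta have the same differential:
\[
x_1 = k\,\frac{x}{\theta} + p, \qquad p \ \text{a constant point}.
\]
% A constant multiple of a solution of (7.1) is again a solution, so k can be
% absorbed into \theta.
```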

From (3.1) we deduce that the condition on the invariant C in this case 
is precisely of the same form as (3.2). 

Given, conversely, a net N whose point equation is (7.1), a solution
θ of (7.1), and a function C(u, v) satisfying the condition (3.2). Just as in
§3, we find that a net N₁ is determined as an F transform of N to within a
radial transformation*; and that the invariants C of the transformations F
are equal to the given function C.

If we replace the coördinates x of N by θx, where θ is a solution of (7.1),
the point equation assumes a similar form but with c = 0.† In this event,
equations (7.2) become similar in form to those for a parallel map in terms
of non-homogeneous coördinates. In fact, as Eisenhart‡ points out, the study
of transformations F in terms of homogeneous coördinates can be made in
this way analytically equivalent to that of parallel maps in terms of non-
homogeneous coördinates. In such a development the invariant C of the
transformation F corresponds to the invariant of the parallel map.§


II. THE INVARIANT H OF A TRANSFORMATION F 


8. Nets and transformations F in terms of tangential coördinates.
The tangential coördinates of a surface S whose point coördinates are x =
x(u, v)‖ are the direction cosines of its normal:


* Inasmuch as we cannot avail ourselves of a translation as we did in § 3, it cannot be concluded
here that there are only ∞¹ transformations F determined.

† Eisenhart, T. S., § 37.

‡ Ibid., p. 89.

§ W. C. Graustein, Parallel maps of surfaces.

‖ From now on we restrict ourselves to three-dimensional space.


\[
\zeta = \frac{1}{D}\,\frac{\partial x}{\partial u}\times\frac{\partial x}{\partial v},{}^{*}
\tag{8.1}
\]
and the distance of the tangent plane from the origin:

\[
w = (\zeta\,|\,x).
\tag{8.2}
\]

The parametric curves of S will form a conjugate net N if and only if the
tangential coördinates ζ and w satisfy an equation of the form†

\[
\frac{\partial^2\lambda}{\partial u\,\partial v}
= \frac{\partial\log\alpha}{\partial v}\,\frac{\partial\lambda}{\partial u}
+ \frac{\partial\log\beta}{\partial u}\,\frac{\partial\lambda}{\partial v}
+ \gamma\,\lambda,
\tag{8.3}
\]

which is known as the tangential equation of N.

Let (8.3) be the tangential equation of the net N of §1; and let ζ, w be
its tangential coördinates. The tangential coördinates of N₁ will be written
ζ₁, w₁. The function λ = (ζ|x′) is the fourth tangential coördinate of N′,
and is also a solution of (8.3).

The net N₁′ traced by the point

\[
x_1' = \frac{x'}{\theta'}
\tag{8.4}
\]

is a radial transform of N′ by means of θ′, and is parallel to N₁.‡ Its fourth
tangential coördinate is λ₁.

The transformation F of §1 is represented in terms of tangential coördi-
nates by the equations§

\[
\frac{\partial}{\partial u}\left(\frac{\zeta_1}{\lambda_1}\right)
= \bar t\,\frac{\partial}{\partial u}\left(\frac{\zeta}{\lambda}\right),
\qquad
\frac{\partial}{\partial v}\left(\frac{\zeta_1}{\lambda_1}\right)
= \bar s\,\frac{\partial}{\partial v}\left(\frac{\zeta}{\lambda}\right).
\tag{8.5}
\]

When we reconcile these equations with those of §1, we find that

\[
\bar t = -\frac{e_1 D_1\,\theta'^4}{e\,D\,t^2 s},
\qquad
\bar s = -\frac{g_1 D_1\,\theta'^4}{g\,D\,t\,s^2}.{}^{*}
\tag{8.6}
\]

* The inner product of the two triples x: (x¹, x², x³), y: (y¹, y², y³) is represented by (x|y);
and their outer product by x×y. In this way we have (x×y|z) = (xyz), the latter term being the
determinant of x, y and z. Also

\[
D^2 = \left(\frac{\partial x}{\partial u}\times\frac{\partial x}{\partial v}\,\Big|\,\frac{\partial x}{\partial u}\times\frac{\partial x}{\partial v}\right) = EG - F^2,
\]

where E, F, G are the coefficients of the square of the linear element of S.
† Eisenhart, Differential Geometry, §§ 66, 67.
‡ Eisenhart, T. S., § 15.
§ Eisenhart, T. S., § 52, (26).
* (Footnote to (8.6).) Here e, f, g are the coefficients of the second fundamental quadratic form for the surface of
the net N.

9. The invariant H. At the focal point F₁ (cf. (1.5)) of the line L of
the harmonic congruence, the tangent plane to any ruled surface of that
congruence is the focal plane there; and a similar situation exists at the
other focal point F₂. Consider an arbitrary ruled surface of the harmonic
congruence. The coördinates of the points of contact, P and P₁, of the tan-
gent planes to N and N₁ with this arbitrary ruled surface depend linearly
on the value of du/dv for this ruled surface. Thus, as du/dv is allowed to
vary, there are defined on the line of the harmonic congruence two ranges of
points, the one traced by P and the other by P₁, which are in projective
correspondence. The fixed points of this projectivity are F₁ and F₂. Ac-
cordingly, the invariant of this projectivity

\[
H = (PP_1, F_1F_2)
\tag{9.1}
\]

is a projective invariant of the transformation F which we call the harmonic
invariant, or briefly, the invariant H.

Evidently the invariant H of the inverse of a transformation F is the
reciprocal of that of the given transformation.

The value of H may be obtained by computing the coördinates of P and
P₁ in terms of point coördinates using the formulas of §1. It is found that

\[
H = \frac{1}{C}\cdot\frac{e_1\,g}{e\,g_1}.
\tag{9.2}
\]

An alternative method of obtaining this result is to use the fact that the
cross ratio (9.1) is equal to that in which the tangent planes to N and N₁
are divided by the focal planes at F₁ and F₂. The tangential coördinates
of the first two planes mentioned are respectively ζ, w and ζ₁, w₁. Those of
the focal planes are proportional to†

\[
\frac{\partial}{\partial u}\left(\frac{\zeta}{\lambda}\right),\ \frac{\partial}{\partial u}\left(\frac{w}{\lambda}\right)
\quad\text{and}\quad
\frac{\partial}{\partial v}\left(\frac{\zeta}{\lambda}\right),\ \frac{\partial}{\partial v}\left(\frac{w}{\lambda}\right).
\]

Hence (9.1) becomes

\[
H = \frac{\bar t}{\bar s}.
\tag{9.3}
\]

By virtue of (8.6) this result is seen to be equivalent to (9.2).

† Eisenhart, T. S., § 52, (27).


A somewhat different geometric consideration brings to light another
meaning of the invariant H. As we leave the point x of N in the direction
du/dv, the tangent plane there twists about the conjugate direction. Con-
sequently we can obtain the invariant H just as the invariant C was ob-
tained,* except that each direction is now to be replaced by its conjugate
direction. In this way, F₁, F₂ of (2.1) are interchanged; and D and D₁ are
replaced by P and P₁, where the latter two points are the intersections of
the line of the harmonic congruence with those tangent lines at x and x₁
to the surfaces of N and N₁ in the directions conjugate to the lines from
x and x₁ to D and D₁.

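* Cf. § 2, above.

10. Perspective transformations. The conditions of integrability of
(8.5) can be written in the form

\[
\frac{\partial\log\bar t}{\partial v}
= \left(1 - \frac{1}{H}\right)\frac{\partial}{\partial v}\log\frac{\lambda}{\alpha},
\qquad
\frac{\partial\log\bar s}{\partial u}
= (1 - H)\,\frac{\partial}{\partial u}\log\frac{\lambda}{\beta},
\tag{10.1}
\]

where we have made use of (8.3) and (9.3).

If H = 1, t̄ = s̄ and their common value is seen from equations (10.1) to
be constant. Equation (8.5) can be integrated in this case and we obtain

\[
\zeta_1 = \frac{\zeta + \lambda c}{\Delta},
\qquad
w_1 = \frac{w + \lambda c_4}{\Delta},
\qquad
\lambda_1 = \frac{\lambda}{\Delta},
\tag{10.2}
\]

where c and c₄ are constants and

\[
\Delta = \left[(\zeta + \lambda c\,|\,\zeta + \lambda c)\right]^{1/2}.
\tag{10.3}
\]

Hence the lines of intersections of corresponding tangent planes to N and N₁
all lie in a fixed plane. We refer to a transformation of this type as a per-
spective transformation.

Inasmuch as a perspective transformation in terms of tangential coördi-
nates is analytically equivalent to a radial transformation in terms of point
coördinates, we may say that they are duals of one another.

A simple example of a perspective transformation is the parallel map, in
which the plane of perspectivity is the plane at infinity.

11. Fundamental theorem. From equations (10.1) in a manner similar
to that of §3 we obtain, as the condition on the invariant H of the trans-
formation (8.5),

\[
\frac{\partial^2\log H}{\partial u\,\partial v}
+ \frac{\partial}{\partial u}\left[\left(\frac{1}{H} - 1\right)\frac{\partial}{\partial v}\log\frac{\lambda}{\alpha}\right]
+ \frac{\partial}{\partial v}\left[(1 - H)\,\frac{\partial}{\partial u}\log\frac{\lambda}{\beta}\right] = 0.
\tag{11.1}
\]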


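Conversely, given a net N whose tangential equation is (8.3), a solution
λ of (8.3), and a function H(u, v) satisfying (11.1). Equations (10.1) and
s̄H = t̄ will then form a compatible system by means of which two functions
t̄ and s̄ are defined to within the same multiplicative constant. The equations

\[
\begin{aligned}
\frac{\partial\omega_i}{\partial u} &= \bar t\,\frac{\partial}{\partial u}\left(\frac{\zeta^i}{\lambda}\right),
&\frac{\partial\omega_i}{\partial v} &= \bar s\,\frac{\partial}{\partial v}\left(\frac{\zeta^i}{\lambda}\right)
\qquad (i = 1, 2, 3),{}^{*}\\
\frac{\partial\omega_4}{\partial u} &= \bar t\,\frac{\partial}{\partial u}\left(\frac{w}{\lambda}\right),
&\frac{\partial\omega_4}{\partial v} &= \bar s\,\frac{\partial}{\partial v}\left(\frac{w}{\lambda}\right)
\end{aligned}
\tag{11.2}
\]

are then compatible; and the functions ωₖ (k = 1, 2, 3, 4), so defined, all satisfy
the equation

\[
\frac{\partial^2\omega}{\partial u\,\partial v}
= \frac{1}{H}\,\frac{\partial}{\partial v}\log\frac{\alpha}{\lambda}\,\frac{\partial\omega}{\partial u}
+ H\,\frac{\partial}{\partial u}\log\frac{\beta}{\lambda}\,\frac{\partial\omega}{\partial v}.
\tag{11.3}
\]

The five functions

\[
\zeta_1^{\,i} = \frac{\omega_i}{(\omega|\omega)^{1/2}} \quad (i = 1, 2, 3),
\qquad
w_1 = \frac{\omega_4}{(\omega|\omega)^{1/2}},
\qquad
\lambda_1 = \frac{1}{(\omega|\omega)^{1/2}}
\tag{11.4}
\]

are solutions of the equation

\[
\frac{\partial^2\lambda_1}{\partial u\,\partial v}
= \frac{\partial}{\partial v}\log\frac{\alpha\,\bar t}{\lambda\,(\omega|\omega)^{1/2}}\,\frac{\partial\lambda_1}{\partial u}
+ \frac{\partial}{\partial u}\log\frac{\beta\,\bar s}{\lambda\,(\omega|\omega)^{1/2}}\,\frac{\partial\lambda_1}{\partial v}
- \frac{\left(\omega\times\dfrac{\partial\omega}{\partial u}\,\Big|\,\omega\times\dfrac{\partial\omega}{\partial v}\right)}{(\omega|\omega)^2}\,\lambda_1.
\tag{11.5}
\]

From (11.4), (ζ₁|ζ₁) = 1. Thus the functions ζ₁ and w₁ can be considered
as the tangential coördinates of a net N₁ whose tangential equation will be
(11.5). Moreover, since (11.2) assumes the form (8.5) when the functions
ω are replaced by ζ₁, w₁ and λ₁, as indicated by (11.4), the net N₁ is an F
transform of N. The analytic work here is the same as in §3, but the inter-
pretation now is that these quadratures determine ζ₁ and w₁ only to within a
perspective transformation (cf. equations (10.2)).

* ζ¹, ζ², ζ³ are the three ordered components of ζ (cf. (8.1)).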


FUNDAMENTAL THEOREM III. A solution λ of the tangential equation (8.3)
of a net N, and a function H(u, v) which satisfies (11.1) determine nets N₁
to within a perspective transformation which are F transforms of N by means
of λ; the invariant H of these transformations F is the given function H(u, v).


12. Triads. We have seen that if N₁ and N₂ are F transforms of a net
N (with (8.3) as its tangential equation) and if N₁ and N₂ are themselves
in relation F, then the nets N, N₁, N₂ form either a harmonic or a conjugate
triad.* From the nature of the function λ† of (8.5), which we say is the solu-
tion of the tangential equation of N used in the transformation F, we see
that if

(i) the same λ is used in (8.5) to obtain N₁ and N₂ as F transforms of N,
the three nets form a conjugate triad; and if

(ii) different solutions λ are used, the three nets form a harmonic triad.

We have thus the following dual relations:

If N₁, N₂, N₃ are three nets in a conjugate [harmonic] triad, the three
transformations F have the same conjugate [harmonic] congruence but
different harmonic [conjugate] congruences; any two of the nets are obtained
from the third by means of the same solution λ [θ] of its tangential [point]
equation, but by different solutions of its point [tangential] equation.

In the same way the invariants C and H, and radial and perspective 
transformations are duals of one another. 

The methods used in proving the results embodied in the theorem of §6 
are applicable to the invariant H also. Consequently we now have the 
complete 


THEOREM IV. If the product of two transformations F is a transformation 
F, the three nets in question form either a conjugate or a harmonic triad. In 
either case, the invariants C and H of the product transformation are equal 
respectively to the product of the invariants C and to the product of the invariants 
H of the given two transformations.


As in §6, we also have, dually, the 


COROLLARY. If two transformations F of a net by the same solution of its
tangential equation have equal invariants H, the two F transforms are per-
spective transforms of one another; and, conversely, if two non-perspective F
transformations with different harmonic congruences of a net N are perspective
transforms of one another, the two transformations F are by means of the same
solution of the tangential equation of N and have equal invariants H.

* Cf. § 6, above.

† I.e., λ = (ζ|x′), cf. § 8. Here λ is the fourth tangential coördinate of N′, the net parallel to N,
whose point coördinates are the direction parameters of the conjugate congruence of the transform-
ation.


13. Transformations F in homogeneous tangential coördinates.
Analytically, the study of transformations F, whether in terms of homo-
geneous point coördinates, or in terms of homogeneous tangential coördi-
nates, is the same.* The work and fundamental theorem of §7 need therefore
only to be dually interpreted to obtain the facts for transformations F in
terms of homogeneous tangential coördinates.

14. The invariants C and H as products of invariants. The transfor-
mation F as set up by Eisenhart† is the product of a parallel transformation
P₁, a radial transformation R₂, and another parallel transformation P₃; i.e.,

\[
N \xrightarrow{\;P_1\;} N' \xrightarrow{\;R_2\;} N_1' \xrightarrow{\;P_3\;} N_1.
\]

Graustein‡ has shown that the invariant C of the product transformation F
is equal to the product of the invariants (C) of these factor transformations.

Consider now the case of the invariant H. Since P₁ and P₃ are parallel,
i.e., perspective, transformations, H₁ = H₃ = 1. For R₂, C₂ = 1, and hence,
from (9.2), H₂ = e₁′g′/(g₁′e′), the quantities bearing on N₁′ and N′. From (1.2)

\[
e' = he, \qquad g' = lg.
\]

Since§

\[
\frac{\partial x_1'}{\partial u} = -\frac{h}{t}\,\frac{\partial x_1}{\partial u},
\qquad
\frac{\partial x_1'}{\partial v} = -\frac{l}{s}\,\frac{\partial x_1}{\partial v},
\]

we also have

\[
e_1' = -\frac{h}{t}\,e_1, \qquad g_1' = -\frac{l}{s}\,g_1.
\]

Hence, since C = t/s,

\[
H_2 = \frac{1}{C}\cdot\frac{e_1\,g}{e\,g_1}.
\]

As a result

\[
H = H_1H_2H_3.
\]

* Eisenhart, T. S., §§ 37, 38 and §§ 51, 52.

† Eisenhart, T. S., § 15.

‡ Cf. W. C. Graustein, An invariant of a general transformation of surfaces, § 5.
§ Eisenhart, T. S., § 15.

THEOREM V. The product of the invariants H of the parallel, radial, and
a second parallel transformation into which a transformation F can be factored 
(in Eisenhart’s way) is equal to the invariant H of the transformation F. 


The transformation F as considered by Jonas* was built up of a radial 
transformation R,, a parallel transformation P2, and another radial trans- 
formation R3; i.e., 


W 3 


In this case also Graustein† has shown that the invariant C of the trans-
formation F is equal to the product of those of R₁, P₂, R₃.

The invariant H for P₂, i.e. H₂, is unity since P₂ is a perspective transfor-
mation. For R₁ and R₃, C₁ = C₃ = 1. Hence, from (9.2),


é 
ge 


=12, Thus, again, 
H = H₁H₂H₃.


THEOREM VI. The product of the invariants H of the radial, parallel and 
second radial transformation into which a transformation F can be factored 
(in Jonas’ way) is equal to the invariant H of the transformation F. 


We are led to the conclusion from these facts that the methods of Eisen- 
hart and Jonas are duals of one another. 


III. NETS OF SPECIAL TYPE 


15. Nets with equal invariants. The point equation of the net N₁
determined in §1 as an F transform of the net N having (1.1) as its point
equation is‡

\[
\frac{\partial^2\theta_1}{\partial u\,\partial v}
= \frac{\partial}{\partial v}\log\left(\frac{a\,t}{\theta'}\right)\frac{\partial\theta_1}{\partial u}
+ \frac{\partial}{\partial u}\log\left(\frac{b\,s}{\theta'}\right)\frac{\partial\theta_1}{\partial v}.
\]

* Cf. footnote on Jonas, Introduction.

† W. C. Graustein, An invariant of a general transformation of surfaces.
‡ Cf. Eisenhart, T. S., § 16, (21).

§ Eisenhart, T. S., § 15.


If the point equation (1.1) has equal point invariants,*

\[
\frac{\partial^2}{\partial u\,\partial v}\log\left(\frac{a}{b}\right) = 0,
\]

and conversely. Thus N₁ will also have equal point invariants if and only if

\[
\frac{\partial^2\log C}{\partial u\,\partial v} = 0; \quad\text{i.e.,}\quad C = \frac{U(u)}{V(v)},
\]

where U is a function of u alone and V of v alone.

The net N whose tangential equation is (8.3) has equal tangential in-
variants† if and only if

\[
\frac{\partial^2}{\partial u\,\partial v}\log\left(\frac{\alpha}{\beta}\right) = 0.
\]

Hence N₁ having (11.5) as its tangential equation will also have equal
tangential invariants if and only if

\[
\frac{\partial^2\log H}{\partial u\,\partial v} = 0; \quad\text{i.e.,}\quad H = \frac{U(u)}{V(v)}.
\]


THEOREM VII. An F transform of a net N having equal point [tangential]
invariants will also have equal point [tangential] invariants if and only if the
invariant C [H] of the transformation is of the form U(u)/V(v).
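
The point half of Theorem VII reduces to one line once the coefficients of the two point equations are compared. The sketch below assumes that the point equation of N₁ has the same shape as (1.1) with a, b replaced by a₁ = at/θ′ and b₁ = bs/θ′; the names a₁ and b₁ are introduced here for illustration.

```latex
% Assumed coefficients for N_1:  a_1 = a t/\theta',  b_1 = b s/\theta',  so a_1/b_1 = (a/b)\,C.
\[
\frac{\partial^2}{\partial u\,\partial v}\log\frac{a_1}{b_1}
= \frac{\partial^2}{\partial u\,\partial v}\log\frac{a}{b}
+ \frac{\partial^2\log C}{\partial u\,\partial v},
\qquad C = \frac{t}{s}.
\]
% If N has equal point invariants the first term on the right vanishes, and N_1 has
% equal point invariants exactly when \partial^2 \log C/\partial u\,\partial v = 0,
% whose general solution is \log C = \log U(u) - \log V(v).
```

The tangential half follows in the same way with a, b, C replaced by α, β, H.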


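* Eisenhart, T. S., § 6.
† Eisenhart, T. S., § 53.

16. Transformations F with constant invariants. Consider two trans-
formations F of a net N by means of the same solution θ of its point equation.
Let the two invariants C of these transformations, C₁ and C₂, be constant.
Equation (3.2) yields

\[
\left(\frac{1}{C_i} - 1\right)\frac{\partial^2}{\partial u\,\partial v}\log\left(\frac{\theta}{a}\right)
+ (1 - C_i)\,\frac{\partial^2}{\partial u\,\partial v}\log\left(\frac{\theta}{b}\right) = 0
\qquad (i = 1, 2).
\tag{16.1}
\]

Thus, either

\[
\text{(i)}\qquad
\frac{\partial^2}{\partial u\,\partial v}\log\left(\frac{\theta}{a}\right) = 0
\quad\text{and}\quad
\frac{\partial^2}{\partial u\,\partial v}\log\left(\frac{\theta}{b}\right) = 0;
\]

or

\[
\text{(ii)}\qquad
\begin{vmatrix}
\dfrac{1}{C_1} - 1 & 1 - C_1\\[6pt]
\dfrac{1}{C_2} - 1 & 1 - C_2
\end{vmatrix} = 0.
\]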




If (i) obtains, N and the two F transforms have equal point invariants. If,
then, N has unequal point invariants, (ii) must hold; i.e.,

\[
(1 - C_1)(1 - C_2)(C_1 - C_2) = 0.
\]
From this result and its dual we are led to 


THEOREM VIII. If, in a harmonic [conjugate] triad of nets with unequal
point [tangential] invariants, the invariants C [H] of the three transformations
F are constant, at least one of the transformations is radial [perspective].


However, if (ii) does not hold, (i) must, and N will have equal point 
invariants. The F transforms will also have equal point invariants (cf. §15). 
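
The passage from the alternative (ii) to the product condition is a direct expansion. A sketch, in which X and Y are names introduced here for the two second derivatives occurring in (16.1), taken in the form (1/Cᵢ − 1)X + (1 − Cᵢ)Y = 0:

```latex
% X = \partial^2 \log(\theta/a)/\partial u\,\partial v,  Y = \partial^2 \log(\theta/b)/\partial u\,\partial v.
% A solution (X, Y) other than (0, 0) requires the determinant of the system to vanish:
\[
\begin{vmatrix}
\dfrac{1}{C_1} - 1 & 1 - C_1\\[6pt]
\dfrac{1}{C_2} - 1 & 1 - C_2
\end{vmatrix}
= (1 - C_1)(1 - C_2)\left(\frac{1}{C_1} - \frac{1}{C_2}\right)
= -\,\frac{(1 - C_1)(1 - C_2)(C_1 - C_2)}{C_1C_2}.
\]
```

Since C₁ and C₂ are finite and different from zero, the determinant vanishes only if one of the three factors in the numerator does.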


THEOREM IX. If a net N admits of two non-radial [non-perspective]
transformations F by means of a given solution θ [λ] of its point [tangential]
equation with constant but unequal invariants C [H], the net N has equal
point [tangential] invariants; and the F transforms also have equal point
[tangential] invariants.


17. Transformations K and Ω. A transformation F for which C = −1
is called a transformation K,* and one for which H = −1, a transformation
Ω.† For a transformation K, equation (3.2) yields

\[
\frac{\partial^2}{\partial u\,\partial v}\log\left(\frac{a}{b}\right) = 0;
\]

and for a transformation Ω, equation (11.1) becomes

\[
\frac{\partial^2}{\partial u\,\partial v}\log\left(\frac{\alpha}{\beta}\right) = 0.
\]

Thus we have the known fact that two nets in relation K [Ω] have equal
point [tangential] invariants.
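
The reduction for a transformation K is a direct substitution. A sketch, assuming (3.2) in the form ∂² log C/∂u∂v + ∂/∂u[(1/C − 1) ∂ log(θ/a)/∂v] + ∂/∂v[(1 − C) ∂ log(θ/b)/∂u] = 0:

```latex
% C = -1 is constant, so the term \partial^2 \log C/\partial u\,\partial v drops out,
% while 1/C - 1 = -2 and 1 - C = 2:
\[
-2\,\frac{\partial^2}{\partial u\,\partial v}\log\frac{\theta}{a}
+ 2\,\frac{\partial^2}{\partial u\,\partial v}\log\frac{\theta}{b}
= 2\,\frac{\partial^2}{\partial u\,\partial v}\log\frac{a}{b} = 0 .
\]
```

The same substitution with H = −1, α, β and λ in place of C, a, b and θ gives the condition for a transformation Ω.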
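* Eisenhart, T. S., § 25.
† Eisenhart, T. S., § 53.

If N has equal point invariants its point equation is of the form

\[
\frac{\partial^2\theta}{\partial u\,\partial v}
= \frac{\partial\log\gamma}{\partial v}\,\frac{\partial\theta}{\partial u}
+ \frac{\partial\log\gamma}{\partial u}\,\frac{\partial\theta}{\partial v}.
\tag{17.1}
\]

Equation (3.2) for a constant invariant C of an F transformation of such
a net yields

\[
(1 - C^2)\,\frac{\partial^2}{\partial u\,\partial v}\log\left(\frac{\theta}{\gamma}\right) = 0.
\tag{17.2}
\]

We may say that in general

\[
\frac{\partial^2}{\partial u\,\partial v}\log\left(\frac{\theta}{\gamma}\right) \neq 0.
\]

This fact and its dual lead to


THEOREM X. A non-radial [non-perspective] transformation F of a net
with equal point [tangential] invariants having a constant invariant C [H] is,
in general, a transformation K [Ω].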


18. The product of the invariants C and H. From (9.2) we see that if 
CH =1, we have 


(18.1) 


Hence 


THEOREM XI. [f two nets correspond by a transformation F for which 
the product of the invariants C and H is unity, the surfaces of the nets are so 
mapped that their asymptotic lines correspond. Conversely, if the surfaces of 
two nets in relation F are so mapped that their asymptotic lines correspond, 
the product of the invariants C and H of the transformation F is unity. 


From (9.2) we conclude 


THEOREM XII. If the parameters of one of two nets in relation F are 
isothermal-conjugate,* those of the other net are also isothermal-conjugate if and 
only if the product of the invariants C and H of the transformation F is unity. 


We can go a step farther. A net is isothermal-conjugate if, when 
parametric, e/g = U(u)/V(v).† Consequently, using Theorem XII and (9.2) 
we have 

THEOREM XIII. If the nets N and N₁ in relation F have three of the following 
four properties, they have the fourth also: 

(a) N and N₁ have equal point invariants; 

(b) N and N₁ have equal tangential invariants; 

(c) N is isothermal-conjugate; 

(d) N₁ is isothermal-conjugate. 


* Eisenhart, Differential Geometry, pp. 198-199. 
† Eisenhart, Differential Geometry, loc. cit. 




If (a)[(b)] holds and (c) and (d) both hold, and one of the nets N and N₁ 
has equal tangential [point] invariants, then the other net also has equal 
tangential [point] invariants. 


We may write (9.2) in the form 


(18.2) 


where K and K₁ are the Gaussian curvatures of the surfaces carrying N 
and N₁. 


THEOREM XIV. If the nets N and N₁ in relation F are real and the parameters 
are real, their surfaces have their Gaussian curvatures of the same or opposite 
sign according as the product CH of the transformation is positive or 
negative. 


IV. TRANSFORMATIONS OF RIBAUCOUR 


19. O-nets and conjugate normal congruences. The curves of a net N 
form an orthogonal system; i.e., N is an O-net, if and only if they are the 
lines of curvature of the surface of N. The point equation of the O-net of a 
surface is 


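\[ \frac{\partial^2\theta}{\partial u\,\partial v} = \frac{\partial\log E^{1/2}}{\partial v}\,\frac{\partial\theta}{\partial u} + \frac{\partial\log G^{1/2}}{\partial u}\,\frac{\partial\theta}{\partial v}, \tag{19.1} \]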


and its tangential equation is 


— bg —— — + — kg —— — 


(19.2) = . 
dv E'l? ou G'/2 dv 


The congruence of normals to a surface is conjugate to its O-net. In 
fact the spherical representation of this O-net N* serves as a parallel net N′ 
whose coördinates are direction parameters of this normal conjugate 
congruence. Let ζ be the direction cosines of the normals: 


Ox 


g 
19.3 . 
( ) G dv 


*The surface of N is assumed to be neither a sphere nor a developable. 


| 
_ (=) 
K giD 


If θ is an arbitrary solution of (19.1), equations 


Op g 00 


G 


(19.4) 


are compatible, and a function so defined is a solution of the point equation 
of the spherical representation N′ of the O-net N. Hence an arbitrary F 
transform N₀ of N along its normal conjugate congruence has the coördinates 


(19.5) Xo 
For the surface of No, 
(19.6) 
where 
(19.7) 
If 06o/du =0, is constant, and the net of (19.5) is parallel 
00,/du =0, 00,/dv equations (19.7) and (19.4) show that 00/du=0, 


06/dv~0. Thus 6 and @ are functions of v alone. In this case we have for 
the directions of the curves of N.* 


(19.8) 


Such a transformation is of a type studied by Graustein† and termed by him 
a semi-parallel map. 


THEOREM XV. A necessary and sufficient condition that an F transform 
of an O-net along its normal conjugate congruence be an O-net is that the 
transformation be either parallel or semi-parallel. 


* Eisenhart, T. S., § 15, (19). 
† W. C. Graustein, Semi-parallel maps of surfaces, Annals of Mathematics, (2), vol. 27 
(1926), p. 271. 


ap 
Ou E ou dv | 
| 
OXo Ox 
-(S-1)5, 
Ou E Ou 




20. Transformations R. If N’:(x’) is an arbitrary net parallel to the 
O-net N, the equations 


00 00 re 
Ou Ou Ov 


are compatible, and θ will satisfy (19.1). The function 


(x | x’) 


(20.2) = 
2 


is a “corresponding” solution of N′ in the sense of §1. Thus we may obtain 
a net N₁: (x₁) 
20 


(20.3) 


(x’| x’) 


as an F transform of N. Here N₁ is also an O-net. 

As a matter of fact, the surfaces of the O-nets N and N₁ are the sheets 
of the envelope of a two-parameter system of spheres, the curves of N and 
N₁ being the loci of the points of contact of the spheres.* This transformation 
F of N into N₁, as indicated by (20.3), is termed a transformation of Ribaucour, 
or briefly, a transformation R. 

The invariants C and H of a transformation R enter in the symmetrical 
relations 


(20.4) 
g1e Gi€ 

The points with coordinates ζ and ζ₁, the direction cosines of the normals 
to N and N₁, trace O-nets on the unit sphere which are in relation F.‡ The 
point equations of these nets on the unit sphere are the same as their 
tangential equations and are also equal to the tangential equations of the 
nets N and N₁. Thus 


THEOREM XVI. The invariant H of a transformation R is equal to the in- 
variant C( =H) of the transformation F existing between the spherical represen- 
tations of the nets in the relation R. 


* Eisenhart, T. S., §§ 68-72. 
† ℰ, ℱ, 𝒢 are the coefficients of the square of the linear element of the spherical representation 
of the surface of S. 
‡ Eisenhart, T. S., § 78, (11). 




21. Applications. We return to the equations (20.4), and exclude 
radial and perspective transformations. 
If the transformation R is also K, C = −1 and 


(21.1) 


and conversely. Since N and N₁ have equal point invariants (cf. §17), their 
surfaces are isothermic. Conversely, if the surfaces of O-nets in relation R 
are isothermic, we can choose parameters so that (21.1) holds. Moreover, 
(21.1) is the condition that the map be conformal. 

The following theorems are thus readily obtained: 


THEOREM XVII. A necessary and sufficient condition that the surfaces of 
O-nets in relation R be conformally mapped is that both surfaces be isothermic. 
The transformation is then also K.* 


THEOREM XVIII. A necessary condition that the two surfaces whose O-nets 
are in relation R be isometrically mapped is that the transformation be also K. 
Both surfaces are then isothermic. 


THEOREM XIX. A necessary and sufficient condition that the spherical 
representations of two surfaces whose O-nets are in relation R be conformally 
mapped is that the transformation be also Ω. 


THEOREM XX. A necessary condition that the spherical representations of 
two surfaces whose O-nets are in relation R be isometrically mapped is that the 
transformation R be also Ω. 


Finally we have 


THEOREM XXI. If two surfaces whose O-nets are in relation R are mapped 
conformally (or isometrically) and if their spherical representations are also 
so mapped, the transformation R is both K and Ω; and the surfaces and their 
spherical representations are isothermic. Conversely, if these surfaces and their 
spherical representations are isothermic, the surfaces are conformally mapped 
and so also are their spherical representations, and the transformation R is 
both K and Ω. In these cases of a transformation R being both K and Ω, 
the asymptotic lines of both surfaces correspond.† 


* Theorem of Cosserat, Eisenhart, T. S. § 82 and footnote (61). 
† Cf. Theorem XI. 


If the O-net of a minimal surface is parametric, 


(21.2) 


Thus we are led to 


THEOREM XXII. When the O-nets of two minimal surfaces are in relation 
R, the invariants C and H of the transformation are equal; and, conversely, if 
the invariants C and H of a transformation R are equal and the surface of one 
of the O-nets is minimal, so is the surface of the second. 


HARVARD UNIVERSITY, 
CAMBRIDGE, Mass. 




THE FOUNDATIONS OF A THEORY IN THE CALCULUS 
OF VARIATIONS IN THE LARGE* 


BY 
MARSTON MORSE 


INTRODUCTION 


The conventional object in a paper on the calculus of variations is the 
investigation of the conditions under which a maximum or minimum of a 
given integral occurs. Writers have accordingly done little with extremal 
segments that have contained more than one point conjugate to a given 
point. An extended theory is needed for several reasons. 

One reason is that in applying the calculus of variations to geodesics, 
or to that very general class of dynamical systems or differential equations 
which may be put in the form of the Euler equations, it is by no means a 
minimum or a maximum that is always sought. For example, if in the problem 
of two bodies we make use of the corresponding Jacobi principle of least 
action† the ellipses which thereby appear as extremals always have pairs of 
conjugate points on them, and do not accordingly give a minimum to the in- 
tegral relative to neighboring closed curves, so that no example of periodic 
motion would be found by a search for a minimum of the Jacobi integral. In 
general if one is looking for extremals joining two points or periodic extremals 
deformable into a given closed curve, the a priori expectation, as justified by 
the results of this paper, in general problems, would seem to be that more 
solutions would not give an extremum than would give an extremum. 

A second reason for the study of extremal segments and periodic extremals 
that do not furnish an extremum for the integral is that if the ultimate object 
sought is an extremum, the existence of such extrema is tied up “in the large” 
with the existence of extremals which do not furnish extrema. It is one of the 
purposes of this paper to show the relations in the large between all sorts 
of extremals joining two fixed points, or deformable into a given closed curve. 

A first type of a priori existence theorem is Hilbert’s theorem concerning 


* Presented to the Society in part December 30, 1924, under the title Relations in the large between 
the numbers of extremals of different types joining two fixed points, and in part December 29, 1926, 
under the title The type number and rank of a closed extremal, and the consequent theory in the large; 
received by the editors January 15, 1927. 

† See P. Appell, Traité de Mécanique Rationnelle, Paris, 1919, vol. 1, pp. 547-548. 




214 MARSTON MORSE [April 


the absolute minimum.* Reference should be given to the more recent 
work of Tonelli,† also on the absolute minimum. Other references to studies 
of this sort will be found in the standard treatises. Birkhoff‡ effectively 
departed from the study of the absolute minimum alone, when he stated 
his “minimax principle,” and applied it to find closed trajectories that do 
not give a minimum to his “least action” integral. The present writer§ 
has shown that Birkhoff’s points of minimum and minimax appear as two 
types of critical points among n+1 such types, and has replaced Birkhoff’s 
inequality relation by n inequality relations and one equality relation. One 
of the objects of the present paper is to show how these n+1 relations 
between critical points can be translated into relations between different 
types of extremals. 

It was necessary to develop for the first time a complete parallelism 
between types of critical points and types of extremals. It was found that 
the type of an extremal segment whose ends were not conjugate was 
completely determined by the number of mutual conjugate points on the segment. 
In the case where the end points were conjugate it was necessary to bring in 
the envelope theory. 

Turning to periodic extremals it was found, even in the most general 
cases, that conjugate points would not serve to determine the type of a 
periodic extremal g, but that other relations of g to neighboring extremals 
had to be brought in. A periodic extremal was called degenerate if the cor- 
responding Jacobi differential equation possessed periodic solutions not iden- 
tically zero. It is shown for the first time how a parameter may be introduced 
into the integrand in such a fashion that the degenerate periodic extremal 
disappears and non-degenerate periodic extremals, or no periodic extremals, 
take its place. 

This theory is brought to a head in two applications, one “in the large,” 
giving relations between different types of extremals, and one “in the small,” 
showing how the type of a given extremal may be characterized in a third 
way, in terms of the possibility or impossibility of deformations of m- 
parameter families of neighboring extremals into families of lower dimen- 
sionality. This deformation theory was made possible by the application 


* Bolza, Vorlesungen über Variationsrechnung, 1909, pp. 419-437. Further references to Bolza 
will be indicated by the letter B. 

† Tonelli, Fondamenti di Calcolo delle Variazioni, vol. 2. 

‡ Birkhoff, Dynamical systems with two degrees of freedom, these Transactions, vol. 18 (1917), 
p. 240. 

§ Marston Morse, Relations between the critical points of a real function of n independent variables, 
these Transactions, vol. 27 (1925), pp. 345-396. 


1928] CALCULUS OF VARIATIONS IN THE LARGE 215 


of certain powerful theorems of analysis situs. It is believed that this defor- 
mation theory will serve as the basis for an even more extended theory “in 
the large.” 


Part I. THE TYPE NUMBER OF AN EXTREMAL SEGMENT 


1. The integrand F(x, y, ẋ, ẏ) and the function J(v₁, ···, vₙ). Let (x, y) 
be any point in an open two-dimensional region S of the x, y plane. Let there 
be given a function F(x, y, ẋ, ẏ) of class C′′′ (B, loc. cit., p. 193), and 
positively homogeneous in ẋ and ẏ of dimension one, for (x, y) in S and ẋ and ẏ 
any two numbers not both zero. Corresponding to the calculus of variations 
problem in the parametric form with integral J, and with F(x, y, ẋ, ẏ) as 
the integrand, let there be given an extremal g of class C′′′ (B, p. 191), 
without multiple points, passing from a point A to a point B, and such that 
along g we have (B, p. 196) 


(1) F₁(x, y, ẋ, ẏ) > 0. 


Let the arc length along g, measured from A toward B, be denoted by u. 
Let 


(2) u₁, u₂, ···, uₙ 


be values of u increasing with their subscripts, and corresponding to 
points on g between A and B. Denote the value of u at A by u₀, and its 
value at B by uₙ₊₁. We suppose the points (2) so chosen on g that no one of 
the closed segments of g bounded by successive points of the set 


(3) u₀, u₁, ···, uₙ₊₁ 


contains a conjugate point of either end point of that segment. Let there be 
given n short arcs of class C′′′, say h₁, h₂, ···, hₙ, passing respectively 
through the points of g at which u takes on the values (2), arcs not tangent 
to g at these points. Let positive senses be assigned to h₁, h₂, ···, hₙ in 
such a fashion that the positive tangent to g at u = uᵢ has to be turned through 
a positive angle less than π to coincide with the positive tangent to hᵢ at 
the same point. Let vᵢ be the arc length measured along hᵢ in hᵢ’s positive 
sense from the point u = uᵢ on g. We regard (uᵢ, vᵢ) as representing the point 
on hᵢ at the distance vᵢ from g. 
If each vᵢ be sufficiently small in absolute value the points 


(4) A, (u₁, v₁), (u₂, v₂), ···, (uₙ, vₙ), B 


can be successively joined by unique extremal segments neighboring g. 




Let the integral J taken along this broken extremal joining A to B be denoted 
by 


(5) J(v₁, ···, vₙ). 


Corresponding to the extremal g the function (5) will be shown to have a 
critical point for (v₁, ···, vₙ) = (0, 0, ···, 0), that is, a point at which all 
of its partial derivatives are zero. We will also investigate the terms of second 
order in the expansion of the function (5) about (0, 0, ···, 0). 

2. A transformation of the problem. We introduce the following lemma 
which simplifies the problem. 


Lemma. It is possible to make a transformation T from the (x, y) plane to 
the (u,v) plane with the following properties: 

(A) Under T the extremal g is carried into a portion of the u axis bounded 
by u = u₀ and u = uₙ₊₁. 

(B) The transformation establishes a one-to-one correspondence between a 
suitably chosen region of the (x, y) plane enclosing g, and a region R₁ of the 
(u, v) plane enclosing γ. 

(C) The transformation carries the arcs h₁, h₂, ···, hₙ into straight line 
segments on which u = u₁, ···, uₙ, respectively. 

(D) The transformation is representable in the form 


(1) x = x(u, v), y = y(u, v) 


where x(u, v), y(u, v) are of class C′′′ in R₁ and possess there a positive jacobian. 
(E) The transformation preserves arc lengths along g and h₁, h₂, ···, hₙ. 

In the first preparation of this paper the complete details of the proof 
of this lemma were given, but second thought makes it seem not too much to 
leave to the reader. 

3. The integral in the (u, v) plane. Under the transformation of the 
preceding lemma the integrand F(x, y, ẋ, ẏ) can be replaced by a new integrand 
G(u, v, u̇, v̇), where ẋ, ẏ, u̇, v̇ stand for derivatives of x, y, u, v, respectively, 
with respect to a parameter t (B, p. 344). We are concerned with properties 
of the extremal segment γ, namely, 


(1) v = 0, u₀ ≤ u ≤ uₙ₊₁, 


which corresponds under the preceding transformation T to the extremal g 
in the (x, y) plane. For the present we need only consider curves on which 
u̇ > 0, and set 


\[ G(u, v, 1, v') = f(u, v, v'), \qquad v' = \frac{dv}{du}, \tag{2} \]




thus defining f(u, v, v′) for all points (u, v) neighboring γ, and all numbers v′. 
For this domain f(u, v, v′) is of class C′′′. The points which are enumerated 
in §1, (4), here correspond to points whose actual coördinates are 


(3) (u₀, 0), (u₁, v₁), ···, (uₙ, vₙ), (uₙ₊₁, 0). 


Subject to the limitation that we deal here only with curves on which 
u̇ > 0, the integral J of §1 becomes here 

\[ J = \int_{u_0}^{u_{n+1}} f(u, v, v')\,du, \tag{4} \]

and the function 

(5) J(v₁, ···, vₙ) 


is the value of the integral (4) taken along the successive extremals joining 
the successive points of (3), varying the coördinates vᵢ, but holding all 
coördinates uᵢ fast. It should be expressly noted that the function (5) is 
not a new function set up for the first time in the (u, v) plane, but that it is 
identical with the function J(v₁, ···, vₙ) defined in §1. 

In terms of f(u, v, v′), and for the extremal γ given by (1), we define 
the functions P(u), Q(u), and R(u) in the usual way (B, p. 55). These 
functions are of class C′ along γ. Because of the assumption that F₁ > 0 
along g, it follows that 


(6) R(u) > 0 


along γ. The Jacobi differential equation corresponding to the extremal 
segment γ will be written in the well known form 


(7) Rw″ + R′w′ + (Q′ − P)w = 0, 


with w the dependent variable, and u the independent variable. We are 
always going to write J. D. E. for (7). 
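
As a rough numerical illustration of the role of (7), the points conjugate to u₀ can be located as the zeros, beyond u₀, of the solution of the J. D. E. that vanishes at u₀. The sketch below is not part of the paper: `conjugate_points` is a hypothetical helper, the coefficients R, R′ and Q′ − P are supplied as plain Python callables, and the constant-coefficient case R = 1, Q′ − P = 1 is an assumed model chosen only because its zeros are known.

```python
import math

def conjugate_points(R, dR, S, u0, u_end, steps=20000):
    """Approximate the points conjugate to u0 between u0 and u_end.

    Integrates R w'' + R' w' + S w = 0 (S stands for Q' - P) with
    w(u0) = 0, w'(u0) = 1 by the classical Runge-Kutta method and
    returns the later zeros of w, located by linear interpolation.
    """
    h = (u_end - u0) / steps

    def rhs(u, w, p):
        # First-order system: w' = p, p' = -(R' p + S w) / R.
        return p, -(dR(u) * p + S(u) * w) / R(u)

    u, w, p = u0, 0.0, 1.0
    zeros = []
    for _ in range(steps):
        k1w, k1p = rhs(u, w, p)
        k2w, k2p = rhs(u + h / 2, w + h / 2 * k1w, p + h / 2 * k1p)
        k3w, k3p = rhs(u + h / 2, w + h / 2 * k2w, p + h / 2 * k2p)
        k4w, k4p = rhs(u + h, w + h * k3w, p + h * k3p)
        w_next = w + h / 6 * (k1w + 2 * k2w + 2 * k3w + k4w)
        p_next = p + h / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        if w * w_next < 0.0:  # w changes sign inside this step
            zeros.append(u + h * w / (w - w_next))
        u, w, p = u + h, w_next, p_next
    return zeros

# Assumed model case w'' + w = 0: the solution sin(u - u0) vanishes at u0 + k*pi.
model_zeros = conjugate_points(lambda u: 1.0, lambda u: 0.0, lambda u: 1.0, 0.0, 7.0)
```

In the model case the two zeros found on the interval from 0 to 7 lie close to π and 2π, so a segment of length 7 carries two points conjugate to its initial point.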

4. The second partial derivatives of J(v₁, ···, vₙ). We shall see presently 
that J(v₁, ···, vₙ) has a critical point when (v₁, ···, vₙ) = (0, ···, 0). 
To determine the nature of this critical point we proceed to the determination 
of the second partial derivatives of J(v₁, ···, vₙ). To that end we represent 
the family of extremals which join the points 

\[ (u_i, v_i), \qquad (u_{i+1}, v_{i+1}) \tag{1} \]

in the form 

\[ v = r^i(u, v_i, v_{i+1}), \qquad u_i \le u \le u_{i+1}, \tag{2} \]




where it is understood that the coördinates uᵢ and uᵢ₊₁ are held fast. The 
functions 


(3) 


will be of class C′ in all of their arguments for u on the interval in (2), and 
for vᵢ and vᵢ₊₁ neighboring zero (B, p. 73 and p. 307). 
We shall understand by w_{μν}(u) a solution of the J. D. E., such that 

\[ w_{\mu\nu}(u_\mu) = 0, \qquad w_{\mu\nu}(u_\nu) = 1 \qquad (|\mu - \nu| = 1;\ \mu, \nu = 0, 1, 2, \dots, n+1). \tag{4} \]
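Because of the fact that the functions (2) satisfy the identities 

\[ r^i(u_i, v_i, v_{i+1}) \equiv v_i, \qquad r^i(u_{i+1}, v_i, v_{i+1}) \equiv v_{i+1}, \tag{5} \]

it follows that 

\[ w_{i+1,i}(u) = r^i_{v_i}(u, 0, 0), \qquad w_{i,i+1}(u) = r^i_{v_{i+1}}(u, 0, 0). \tag{6} \]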


Differentiation of the integral J, and integration by parts in the usual way 
will now give 
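\[ J_{v_i}(v_1, \dots, v_n) = f_{v'}\bigl[u_i, v_i, r^{i-1}_u(u_i, v_{i-1}, v_i)\bigr] - f_{v'}\bigl[u_i, v_i, r^i_u(u_i, v_i, v_{i+1})\bigr]. \tag{7} \]
Here it is understood that 
\[ i = 1, 2, \dots, n, \qquad v_0 = 0, \qquad v_{n+1} = 0. \tag{8} \]
From (7) we see that all the partial derivatives of J(v₁, ···, vₙ) are zero 
at (0, ···, 0), and note that J(v₁, ···, vₙ) is of class C′ in the neighborhood 
of (0, ···, 0). 

From (7) we see that 
\[ J_{v_i v_j} = 0, \qquad |i - j| \neq 1, 0. \tag{9} \]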


The remaining second partial derivatives will be evaluated at (0, - - - , 0). 
Evaluation at the latter point will be indicated by a subscript zero preceding 
the partial derivative. We obtain the following results:* 


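\[ {}_0J_{v_i v_i} = R(u_i)\bigl[r^{i-1}_{u v_i}(u_i, 0, 0) - r^i_{u v_i}(u_i, 0, 0)\bigr] = R(u_i)\bigl[w'_{i-1,i}(u_i) - w'_{i+1,i}(u_i)\bigr] \qquad (i = 1, 2, \dots, n); \tag{10} \]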


* A. Dresden has given complete formulas for the second partial derivatives of the extremal in- 
tegral (Bolza, p. 310). Use has not been made of this work, however, because the formulas given in 
the present paper need not have the general form given by Dresden, and do need a different notation. 




The last two partial derivatives are necessarily equal, as can be proved 
directly from the properties of the J. D. E. 
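
The equality asserted above can be traced to the self-adjoint form of the J. D. E. The following is only a sketch of that argument, not the author's wording. It assumes that the two mixed partial derivatives have the forms written on the first line, which is what differentiation of the formula for the first partial derivatives would give; this is a reconstruction and should be checked against the printed formulas (11) and (12).

```latex
\begin{align*}
% Assumed forms of the mixed second partial derivatives at (0, ..., 0):
{}_0J_{v_i v_{i+1}} &= -R(u_i)\, w'_{i,i+1}(u_i), &
{}_0J_{v_{i+1} v_i} &= R(u_{i+1})\, w'_{i+1,i}(u_{i+1}). \\
% The J. D. E. may be written (R w')' + (Q' - P) w = 0, so for two solutions p, q:
\frac{d}{du}\Bigl[ R\,(p\,q' - q\,p') \Bigr] &= 0. \\
% Take p = w_{i+1,i} (p = 1 at u_i, p = 0 at u_{i+1})
% and  q = w_{i,i+1} (q = 0 at u_i, q = 1 at u_{i+1}):
R(u_i)\, w'_{i,i+1}(u_i) &= -R(u_{i+1})\, w'_{i+1,i}(u_{i+1}).
\end{align*}
```

Multiplying the last line by −1 gives the equality of the two mixed partial derivatives under the stated assumption, so that the matrix of second derivatives is symmetric.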
The matrix whose elements are 


\[ a_{ij} = {}_0J_{v_i v_j} \qquad (i, j = 1, 2, \dots, n) \tag{13} \]


has now been determined. Note first that all elements in the ith row of this 
matrix have the factor R(uᵢ). Let this factor be removed from the ith row 
(i = 1, 2, ···, n). We will write down the resulting matrix for a typical case, 
n = 5. 


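\[
\begin{pmatrix}
w'_{01}(u_1) - w'_{21}(u_1) & -w'_{12}(u_1) & 0 & 0 & 0 \\
w'_{21}(u_2) & w'_{12}(u_2) - w'_{32}(u_2) & -w'_{23}(u_2) & 0 & 0 \\
0 & w'_{32}(u_3) & w'_{23}(u_3) - w'_{43}(u_3) & -w'_{34}(u_3) & 0 \\
0 & 0 & w'_{43}(u_4) & w'_{34}(u_4) - w'_{54}(u_4) & -w'_{45}(u_4) \\
0 & 0 & 0 & w'_{54}(u_5) & w'_{45}(u_5) - w'_{65}(u_5)
\end{pmatrix}
\]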


5. The rank of the extremal segment g. We prove the following theorem: 


THEOREM 1. Let the function J(v₁, ···, vₙ) be set up for the extremal 
segment g, as described in §1. The matrix a, whose elements are 

\[ a_{ij} = {}_0J_{v_i v_j} \qquad (i, j = 1, 2, \dots, n), \]

is of rank n if the final point of g is not among the conjugate points of the initial 
point of g. Otherwise a is of rank n−1. 


To prove this theorem let us turn to the (u, w) plane of the J. D. E., and 
in that plane join the successive points 


(u₀, 0), (u₁, c₁), ···, (uₙ, cₙ), (uₙ₊₁, 0), 


by curve segments representing solutions of the J. D. E. The successive 
segments are of the form 

\[ w = c_i\, w_{i+1,i}(u) + c_{i+1}\, w_{i,i+1}(u), \qquad u_i \le u \le u_{i+1}, \tag{1} \]

where 
\[ i = 0, 1, \dots, n, \qquad c_0 = 0, \qquad c_{n+1} = 0. \tag{2} \]


Let the curve obtained by combining the successive segments (1) be denoted 
by λ. A necessary and sufficient condition that the point (u₀, 0) be 
conjugate to (uₙ₊₁, 0) is that among the curves λ there exist at least one, not 




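w ≡ 0, that has no corners at the junctions of its successive segments. The 
n conditions that λ have no corners at the junctions of these segments are 

\[ c_{i-1}\, w'_{i,i-1}(u_i) + c_i\, w'_{i-1,i}(u_i) - \bigl[ c_i\, w'_{i+1,i}(u_i) + c_{i+1}\, w'_{i,i+1}(u_i) \bigr] = 0, \tag{3} \]
where 
\[ i = 1, 2, \dots, n, \qquad c_0 = c_{n+1} = 0. \tag{4} \]

Equations (3) may be written in the form 

\[ c_{i-1}\, w'_{i,i-1}(u_i) + c_i \bigl[ w'_{i-1,i}(u_i) - w'_{i+1,i}(u_i) \bigr] - c_{i+1}\, w'_{i,i+1}(u_i) = 0, \tag{5} \]
subject again to the conditions (4). 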

Let the matrix of the coefficients of (c₁, ···, cₙ) in (5) be denoted by w 
and the value of the corresponding determinant by |w|. With the aid of 
(9), (10), (11), and (12) of §4 we obtain the following equation, giving the 
determinant |a| of the theorem to be proved: 

\[ |a| = R(u_1)\, R(u_2) \cdots R(u_n)\, |w|. \tag{6} \]


For |a|, and accordingly |w|, to be zero, it is necessary and sufficient that 
the equations (5), subject to the conditions (4), admit a solution (c₁, ···, cₙ) 
in which the cᵢ’s are not all zero. That is, for |w| and |a| to be zero, it is 
necessary and sufficient that (u₀, 0) be conjugate to (uₙ₊₁, 0). 

It remains to prove that if a is of rank less than n, its rank is exactly n−1. 
Suppose |a| = 0. Then (u₀, 0) is conjugate to (uₙ₊₁, 0). The point (u₀, 0) 
cannot also be conjugate to (uₙ, 0), for otherwise (uₙ, 0) would be conjugate 
to (uₙ₊₁, 0), contrary to the original choice of (u₁, ···, uₙ). Now a consideration 
of the form of the minor Aₙ₋₁ of the element aₙₙ of a shows that the 
vanishing of Aₙ₋₁ is the condition that (u₀, 0) be conjugate to (uₙ, 0). Hence 
Aₙ₋₁ ≠ 0, and a is of rank n−1. Thus the theorem is completely proved. 

We denote by Aᵢ (i = 1, 2, ···, n) the determinant obtained from 
a by striking out the last n−i rows and columns of a, and set A₀ = 1. 


COROLLARY 1. A necessary and sufficient condition that Aᵢ = 0 (i = 1, 
2, ···, n) is that (u₀, 0) be conjugate to (uᵢ₊₁, 0). 


This follows at once from the form of Aᵢ and from Theorem 1. 
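
A worked special case may make Corollary 1 concrete. Assume, purely for illustration, that R ≡ 1 and Q′ − P ≡ 1, so that the J. D. E. reduces to w″ + w = 0; this choice is not made in the paper. Assume also that the diagonal and off-diagonal elements of a are R(uᵢ)[w′ᵢ₋₁,ᵢ(uᵢ) − w′ᵢ₊₁,ᵢ(uᵢ)] and −R(uᵢ) w′ᵢ,ᵢ₊₁(uᵢ) respectively. Writing d_j = u_{j+1} − u_j, with 0 < d_j < π, one then finds:

```latex
\begin{align*}
w_{\mu\nu}(u) &= \frac{\sin(u - u_\mu)}{\sin(u_\nu - u_\mu)}, \\
a_{ii} &= \cot d_{i-1} + \cot d_i, \qquad
a_{i,i+1} = a_{i+1,i} = -\frac{1}{\sin d_i}, \\
A_i &= \frac{\sin(u_{i+1} - u_0)}{\prod_{j=0}^{i} \sin d_j} \qquad (i = 1, \dots, n).
\end{align*}
```

In this model Aᵢ vanishes exactly when uᵢ₊₁ − u₀ is a multiple of π, that is, when (u₀, 0) is conjugate to (uᵢ₊₁, 0). The closed form for Aᵢ follows by induction from the three-term recurrence for the leading minors of a tridiagonal matrix, together with the identity sin(a+b) sin(b+c) − sin a sin c = sin b sin(a+b+c).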
COROLLARY 2. The matrix a is in normal form.* 


For if Aᵣ and Aᵣ₊₁ (0<r<n) were both zero, (u₀, 0) would be conjugate 


* Boécher, Introduction to Higher Algebra, p. 59. 




to both (uᵣ₊₁, 0) and (uᵣ₊₂, 0), contrary to the original restrictions on (u₁, 
u₂, ···, uₙ). 

6. The type number of the extremal segment g. We prove the following 
theorem: 


THEOREM 2.* If the final end point B of the extremal segment g of §1 is 
not conjugate to its initial point A, but there are k points (0 ≤ k ≤ n) on g 
between A and B conjugate to A, then the symmetric quadratic form 

\[ \sum_{i,j=1}^{n} a_{ij}\, v_i v_j, \tag{1} \]

when reduced by a real non-singular linear transformation to squared terms 
only, will have k coefficients that are negative, and n−k that are positive. 

The number k is called the type number of the critical point (0, ···, 0) 
of J(v₁, ···, vₙ), and also the type number of g. 


To prove this theorem we shall make use of the well known fact (Bôcher, 
loc. cit., p. 147) that in a regularly arranged quadratic form the type number 
k equals the number of changes of sign in the sequence A₀, A₁, ···, Aₙ (§5) 
where an Aᵢ which is zero is counted as positive or negative at pleasure. 
To continue we adopt the method of mathematical induction. In the case 
n = 1 we are concerned simply with the fixed end points (u₀, 0) and (u₂, 0), 
and an intermediate point (u₁, 0). Here 

\[ A_1 = R(u_1)\bigl[ w'_{01}(u_1) - w'_{21}(u_1) \bigr]. \tag{2} \]


It is readily seen that this difference is negative or positive according to 
whether or not (u₀, 0) possesses a conjugate point prior to (u₂, 0). We now 
distinguish between the cases in which Aₙ₋₁ = 0, and Aₙ₋₁ ≠ 0. 

Case I. Aₙ₋₁ ≠ 0. If we assume the validity of the theorem for the case 
where there are n−1 coördinates v₁, ···, vₙ₋₁, we can conclude that there 
are as many changes in sign in the sequence A₀, ···, Aₙ₋₁ as there are 
conjugate points prior to (uₙ, 0). To determine whether there is a change of 
sign in passing from Aₙ₋₁ to Aₙ, it is useful to consider again equations (5) 
of §5, unaltered except that the zero in the right hand member of the last 
equation is here replaced by 


(3) Aₙ₋₁/Aₙ 


and the left hand member of the ith one of these equations is here to be 
multiplied by R(uᵢ) (i = 1, 2, ···, n). 


* The corresponding theorem in m dimensions has recently been discovered by the author. 




The resulting equations can now be solved for the constants c₁, ···, cₙ. 
In particular we obtain, by Cramer’s rule, 

\[ c_n = \frac{A_{n-1}^2}{A_n^2} > 0. \tag{4} \]


The curve λ of §5 corresponding to these constants (c₁, ···, cₙ) has just one 
corner, and that at (uₙ, cₙ). The last equation of (5), §5, altered as we have 
said, now tells us that at this corner the slope of the first segment of λ 
exceeds or is less than the slope of the second segment of λ, according as the 
right hand member (3) is positive or negative. This result can be interpreted 
in terms of the J. D. E. to mean that there is, or is not, a conjugate point of 
(u₀, 0) between (uₙ, 0) and (uₙ₊₁, 0), according as the quotient (3) is negative 
or positive. Thus, in case Aₙ₋₁ ≠ 0, the total number of changes of sign in 
the sequence A₀, A₁, ···, Aₙ equals the total number of conjugate points 
of (u₀, 0) prior to (uₙ₊₁, 0). 

Case II. Aₙ₋₁ = 0. In this case we make use of the fact that, if Aₙ₋₁ = 0 
in a regularly arranged non-singular quadratic form, then Aₙ₋₂ ≠ 0. Assuming 
then the validity of the theorem for the case where there are n−2 points 
between the end points of a given extremal we can conclude that there are as 
many changes of sign, say h, in the sequence A₀, A₁, ···, Aₙ₋₂ as there 
are conjugate points of (u₀, 0) prior to (uₙ₋₁, 0). But since Aₙ₋₁ = 0 it follows 
from Corollary 1, §5, that (uₙ, 0) is conjugate to (u₀, 0) so that there are 
altogether h+1 conjugate points of (u₀, 0) prior to (uₙ₊₁, 0). On the other hand 
it follows from the theory of regularly arranged quadratic forms that if 
Aₙ₋₁ = 0, then Aₙ₋₂ and Aₙ have opposite signs. Thus in the sequence A₀, 
A₁, ···, Aₙ, there are h+1 changes of sign, and the theorem is proved in 
Case II as well as in Case I. 
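
The statement of Theorem 2 can be checked numerically in a simple model. The sketch below is an illustration only and is not taken from the paper: it assumes the J. D. E. w″ + w = 0 (R = 1, Q′ − P = 1), for which the matrix a is tridiagonal with diagonal elements cot dᵢ₋₁ + cot dᵢ and off-diagonal elements −1/sin dᵢ, where dᵢ = uᵢ₊₁ − uᵢ. The function names are hypothetical. It counts the changes of sign in A₀, A₁, ···, Aₙ and compares the result with the number of multiples of π lying strictly inside the interval.

```python
import math

def leading_minors(us):
    """Return [A_0, A_1, ..., A_n] for the partition us = [u_0, ..., u_{n+1}].

    Assumes the model Jacobi equation w'' + w = 0, so every segment must be
    shorter than pi (no segment contains a conjugate point of its end point).
    """
    gaps = [b - a for a, b in zip(us, us[1:])]
    if not all(0.0 < d < math.pi for d in gaps):
        raise ValueError("each segment must have length strictly between 0 and pi")

    def cot(d):
        return math.cos(d) / math.sin(d)

    minors = [1.0]          # A_0 = 1
    before_previous = 0.0   # stands in for A_{-1}; unused weight in the first step
    for i in range(1, len(us) - 1):
        diagonal = cot(gaps[i - 1]) + cot(gaps[i])   # a_ii
        off = -1.0 / math.sin(gaps[i - 1])           # a_{i,i-1} = a_{i-1,i}
        current = diagonal * minors[-1] - off * off * before_previous
        before_previous = minors[-1]
        minors.append(current)
    return minors

def sign_changes(values):
    """Count changes of sign between consecutive nonzero terms."""
    signs = [v > 0 for v in values if v != 0.0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def conjugate_points_before(us):
    """Number of k >= 1 with u_0 + k*pi strictly less than u_{n+1}."""
    return math.ceil((us[-1] - us[0]) / math.pi) - 1

partition = [0.0, 1.0, 2.2, 3.0, 4.5, 5.5, 7.0]
type_number = sign_changes(leading_minors(partition))
```

For the partition shown, whose total length 7 lies between 2π and 3π, both counts equal 2. The three-term recurrence is used instead of a general determinant routine because a is tridiagonal in this setting.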


Part II. RELATIONS IN THE LARGE BETWEEN EXTREMALS JOINING A TO B 


7. The integrand and region S. In the development of this chapter 
we do not wish to exclude the case where the end points of the extremal seg- 
ment are conjugate. The complete treatment of an extremal segment whose 
ends are conjugate, both in our theory and in the classical theory, depends 
upon the nature of the singularity at B of the envelope of the extremals 
passing through A. Such a treatment, at least as developed so far, requires 
the assumption that the functions used be analytic. 


I. We therefore assume that the function F(x, y, ẋ, ẏ) is positively 
homogeneous of the first degree in ẋ and ẏ, and analytic in all of its arguments, for 
(x, y) any point interior to an open region S of the (x, y) plane, and ẋ and ẏ any 



two numbers not both zero. We shall also assume that for the same arguments 
F₁(x, y, ẋ, ẏ) > 0. 


In S the differential equations of the extremals can be put in the Bliss* 
form, 

\[ \frac{dx}{ds} = \cos\theta, \qquad \frac{dy}{ds} = \sin\theta, \qquad \frac{d\theta}{ds} = H(x, y, \cos\theta, \sin\theta), \tag{1} \]
where 
\[ H(x, y, \dot x, \dot y) = \frac{F_{y\dot x} - F_{x\dot y}}{F_1\,[\dot x^2 + \dot y^2]^{3/2}}. \tag{2} \]

A necessary and sufficient condition that a solution g of (1) on which (x, y, θ) 
= (x₀, y₀, θ₀) for some value of s, be identical as a set of points (x, y), with 
the solution on which (x, y, θ) = (x₀, y₀, θ₀ + π) for some value of s, is that, 
on g, 


(3) H(x, y, cos θ, sin θ) + H(x, y, −cos θ, −sin θ) = 0. 


II.† We assume that (3) holds identically for every point (x, y) in S, and for 
every 6. This type of problem is termed reversible. 


8. The regions S₁ and R. We can choose between a great variety of 
assumptions that will serve as boundary conditions. The following perhaps 
are as simple as any. 


III. Let there be given a closed region S₁, consisting of points interior to S, 
bounded by a simple closed curve β consisting of a finite number of analytic arcs. 
We assume that this boundary β is extremal-convex‡, in the sense that the interior 
angles at the vertices of β shall be between 0 and π, and that an extremal tangent 
to an analytic arc β′ of β at a point P, shall in the neighborhood of P, except for 
P, lie wholly on that side of β′ which is exterior to S₁. 


IV.† We assume further that the region S₁ is covered in a one-to-one manner 
by a proper field of extremals of the form 


(1) x = h(u, v), y = k(u, v), 


where u is the parameter, and v is the arc length measured along the extremals, 
where at every point (u, v) that corresponds to a point (x, y) in S₁ the functions 
(1) are single-valued and analytic in u and v, and 


* Bliss, these Transactions, vol. 7 (1906), p. 180. 

† The author has recently removed hypotheses II and IV, assuming then that F is positive 
definite. Powerful results obtain, but the proofs are necessarily more difficult. 

‡ Compare Bolza, loc. cit., pp. 276-278, and also Birkhoff, loc. cit., pp. 216-219. 




\[ D = \begin{vmatrix} h_u & h_v \\ k_u & k_v \end{vmatrix} \neq 0. \]
The region S₁ in the (x, y) plane will correspond in the (u, v) plane to a 
closed region R, bounded by a simple closed curve γ consisting of a finite 
number of analytic arcs, and again extremal-convex. When u is the 
independent variable, and v the dependent variable, the integral J can be put 
in the non-parametric form as in §3 with f(u, v, v′) the integrand. Because 
of the assumption of reversibility the non-parametric problem, with integrand 
f(u, v, v′), will here include all the extremals of the parametric form except 
those of the family u = constant. From our assumption regarding F₁, we have 
here that 

\[ f_{v'v'}(u, v, v') > 0, \]


for all points (u, v) in R, and any number v’. Concerning the region R, we 
can now establish the following lemma: 


Lemma. Let a and b be, respectively, the minimum and maximum values of 
u on γ. Then γ consists of two arcs of the form 


v = A(u), v = B(u), a ≤ u ≤ b, 


where A(u) and B(u) are of class C, and analytic in u on the interval a ≤ u ≤ b, 
except for a finite set of values of u, while 


(2) A(u) < B(u), a < u < b, 
(3) A(a) = B(a), A(b) = B(b). 

Finally the interior points (u, v) of R are the points satisfying 

(4) A(u) < v < B(u), a < u < b. 


To prove these statements let (u₀, v₀) be any interior point of R. There 
exist two numbers, v₁ and v₂, of such sort that the points (u, v) for which 


(5) u = u₀, v₁ < v < v₂ 


include the point (u₀, v₀), but no points other than interior points of R, 
while the points (u₀, v₁) and (u₀, v₂) are on the boundary of R. Now the 
extremal segment (5) is not tangent to γ at either of its ends, since γ is 
extremal-convex. If γ has a vertex at (u₀, v₂), the two analytic arcs adjoining 
(u₀, v₂) must lie on opposite sides of u = u₀, at least in the neighborhood of 
(u₀, v₂), for otherwise the points of (5) in the neighborhood of (u₀, v₂) 
would not lie in R. A similar statement applies to (u₀, v₁). Thus, in 




any case, there must exist functions h(u) and k(u) of class C, and constants
$a_1$ and $b_1$, differing from $u_0$ by so little, that the points (u, v) satisfying

(6) $h(u) < v < k(u), \quad a_1 < u < b_1,$

are all interior points of R, while the curves

(7) $v = h(u), \quad v_1 = h(u_0), \quad a_1 \le u \le b_1,$
    $v = k(u), \quad v_2 = k(u_0), \quad a_1 < u_0 < b_1,$

lie on the boundary of R.

We wish to show that these functions h(u) and k(u) can be extended in
definition, as functions of class C, so that the preceding statements regarding
(6) and (7) still hold true when $a_1 = a$ and $b_1 = b$. Whether this is true or not
there certainly exist constants $a_2$ and $b_2$, such that

$a \le a_2 < u_0 < b_2 \le b,$

and functions h(u) and k(u) of class C, such that the curves

(8) $v = h(u), \quad v_1 = h(u_0),$
    $v = k(u), \quad v_2 = k(u_0), \quad a_2 \le u \le b_2,$

lie on the boundary of R, and the points (u, v) which satisfy

(9) $h(u) < v < k(u), \quad a_2 < u < b_2,$

are all interior points of R, while further, $a_2$ and $b_2$ are constants such that
the interval in (8), and the corresponding interval in (9), cannot be expanded
at either or both ends, and the preceding statements about (8) and (9)
still hold true for these expanded intervals.


The following division into cases is exhaustive:
Case I $a_2 = a$, $b_2 = b$;
Case II $a_2 = a$, $b_2 < b$;
Case III $a < a_2$, $b_2 = b$;
Case IV $a < a_2$, $b_2 < b$.


Case I. In this case the theorem follows readily. 

Case II. In this case we will prove that the inequality $b_2 < b$ is impossible.
In the first place if we had $h(b_2) = k(b_2)$, the curve (8) would completely bound
the region (9), a sub-region of R. This is impossible, since R consists of a
single connected region. Hence $h(b_2) < k(b_2)$. The points (u, v) which satisfy

(10) $h(b_2) < v < k(b_2), \quad u = b_2,$




are limit points of points of (9), and are accordingly points of R. We distinguish
again between two cases, Cases IIa and IIb.

In Case IIa we suppose the points of (10) are all interior points of R.
In this case (8) and (9) will hold as stated for a slightly larger $b_2$, and thus
we have a contradiction.

In Case IIb we suppose the points of (10) include at least one boundary
point P of R. Such a point P cannot be a vertex of the boundary without
the interior angle at the vertex being greater than π, contrary to an hypothesis.
Neither can P be an ordinary point of the boundary, for in that
case the boundary would have to be tangent to the extremal segment (10),
a result which is again impossible, since the boundary is extremal-convex.
Thus Case IIb leads to a contradiction under all circumstances, and is 
impossible. 

Similarly, it can be proved that Cases III and IV are impossible. Case I 
alone is possible, and the theorem in italics follows readily. 

9. Further properties of the region R. In R no extremals other than the
extremals u=constant can be tangent to the extremals u=constant. Hence
every extremal segment other than the extremal segments u=constant is
representable in the form v = M(u), where M(u) is an analytic function of u,
for u on the interval (§8, Lemma)

$a \le u \le b,$

or some sub-interval of that interval.

Let A and B be two distinct points of R, either interior to R, or on the 
boundary of R. Any extremal joining A to B in R will have no points on the 
boundary of R, except possibly its end points A or B. This follows readily from 
the extremal convex nature of the boundary (§8). 

We will now prove that two points A and B in R can be joined in R by at
most a finite number of extremal segments. If A and B are both on the same
extremal segment $u = u_0$, that extremal segment is the only extremal segment
in R which can join A to B. Suppose then that $(u_0, v_0)$ and $(u_1, v_1)$ are two
points of R for which $u_0 < u_1$, and which can be joined by an infinite set of
extremal segments in R. Denote by m the angles in the (u, v) plane which the
tangents to these extremals at $(u_0, v_0)$ make with a parallel to the positive
u axis. Take these angles between $-\pi/2$ and $\pi/2$.

Now there will be at least one limit angle $m_0$ of these angles m. The
v coördinate of an extremal $E_0$ issuing from $(u_0, v_0)$ with the angle $m_0$ can be
continued as an analytic function of u until $E_0$ passes out of R or through
$(u_1, v_1)$. The extremal $E_0$ cannot pass out of R before passing through
$(u_1, v_1)$, because if $E_0$ did so pass out of R, extremals with angles m sufficiently
near $m_0$ would also pass out of R before passing through $(u_1, v_1)$, which is
impossible.

It would follow then from the envelope theory of extremals that all
extremals, without exception, that issue from $(u_0, v_0)$ with angles neighboring
$m_0$ would pass through $(u_1, v_1)$. Moreover, we could then prove that
all extremals issuing from $(u_0, v_0)$ with angles m for which

(1) $m_0 \le m < \pi/2$

would remain in R, at least until they passed through $(u_1, v_1)$. For otherwise
there would be a least upper bound $m_1 < \pi/2$, of angles m at which extremals
in R issue from $(u_0, v_0)$ and pass through $(u_1, v_1)$. But the extremal issuing
from $(u_0, v_0)$ with the angle $m_1$ would lie entirely within R, except possibly
for its end points, so that extremals issuing from $(u_0, v_0)$ with angles slightly
larger than $m_1$ would also lie in R, and pass through $(u_1, v_1)$. Therefore no
such least upper bound $m_1$, for which $m_1 < \pi/2$, exists. Thus all extremals
which issue from $(u_0, v_0)$ with angles m satisfying (1) must pass through
$(u_1, v_1)$ before passing out of R.

But an extremal issuing from $(u_0, v_0)$ with an angle sufficiently near $\pi/2$
will pass out of R without passing through $(u_1, v_1)$. From this contradiction
we infer the truth of the statement to be proved.

10. The function $J(v_1, \cdots, v_n)$. None of the extremal segments $u = u_0$ in
R have a point on them conjugate to either end point. To prove this let the
J. D. E. in the Weierstrass form with independent variable t = v, be set up
corresponding to the extremal $u = u_0$ in the (x, y) plane. This differential
equation has as a solution the determinant D of §8, provided we set $u = u_0$
and let v vary. The absence of any conjugate points on $u = u_0$ follows from
the non-vanishing of this determinant. From the absence of conjugate
points in R on the extremals $u = u_0$, the "regularity" ($F_1 > 0$), and analyticity
of the problem, and the extremal-convex nature of the boundary, it follows
that a positive constant e can be determined small enough to have the following
properties. Any point of R for which $u = u_0$, can be joined to any point of R
for which u differs from $u_0$ in absolute value by less than e, by a unique analytic
extremal h lying entirely within R, excepting possibly its end points, and such
that on h there are no pairs of conjugate points (B, p. 307). The questions of
uniformity arising can be met by the reader without too much difficulty.

Now let there be given two points A and B of R, not on the same extremal
segment $u = u_0$. We are concerned with the extremals joining A to B, if any
such exist. Let $u_0, u_1, \cdots, u_{n+1}$ be a set of increasing values of u of which
successive values differ at most by the constant e of the preceding paragraph,
and which are such that A lies on $u = u_0$, and B on $u = u_{n+1}$. Consistent with
this, let the coördinates of A and B be respectively $(u_0, v_0)$ and $(u_{n+1}, v_{n+1})$;
and $(v_1, \cdots, v_n)$ be variables such that the points

(1) $(u_1, v_1), \cdots, (u_n, v_n)$

are all in R. Let the successive points of the set

(2) $(u_0, v_0), (u_1, v_1), \cdots, (u_{n+1}, v_{n+1})$

be joined by the unique extremal segments of the preceding paragraph,
and let the integral J be evaluated along the resulting curve. If we hold
the end points A and B fast, as well as the u coördinates of the intermediate
points, the value of J will be a function, $J(v_1, \cdots, v_n)$, that will be analytic
in $(v_1, \cdots, v_n)$ in the domain (Lemma §8)

(3) $A(u_i) \le v_i \le B(u_i) \quad (i = 1, 2, \cdots, n).$


The partial derivatives of the function J are given by the formulas

(4) $J_{v_i}(v_1, \cdots, v_n) = f_{v'}(u_i, v_i, p_i) - f_{v'}(u_i, v_i, q_i) \quad (i = 1, 2, \cdots, n),$

where $p_i$ is the slope at $(u_i, v_i)$ of the extremal joining $(u_{i-1}, v_{i-1})$ to $(u_i, v_i)$,
and $q_i$ is the slope at $(u_i, v_i)$ of the extremal joining $(u_i, v_i)$ to $(u_{i+1}, v_{i+1})$. Since

$f_{v'v'}(u, v, v') > 0$

for every (u, v) in R and every v', the partial derivative (4) is zero, when and
only when $p_i = q_i$. We have the result that the function $J(v_1, \cdots, v_n)$ has
a critical point $(v_1, \cdots, v_n)$, when and only when the points (2) all lie on a
single analytic extremal.
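
The formula for the partial derivatives of J can be checked numerically in the simplest case. The sketch below assumes, purely for illustration, the integrand f(u, v, v') = (1 + v'^2)^{1/2} of Euclidean arc length, which is not the general integrand of the text. Its extremals are straight lines, so that J is the length of a broken line and f_{v'}(p) = p/(1 + p^2)^{1/2}. All function names are hypothetical.

```python
import math

def f_vprime(p):
    """Derivative f_{v'} of the illustrative integrand f = sqrt(1 + v'^2)."""
    return p / math.sqrt(1.0 + p * p)

def J(us, vs):
    """Integral along the broken extremal: here the length of a polygonal line."""
    return sum(math.hypot(us[i + 1] - us[i], vs[i + 1] - vs[i])
               for i in range(len(us) - 1))

def grad_by_formula(us, vs):
    """Partial derivatives dJ/dv_i (i = 1..n) as f_{v'}(p_i) - f_{v'}(q_i)."""
    grad = []
    for i in range(1, len(us) - 1):
        p = (vs[i] - vs[i - 1]) / (us[i] - us[i - 1])  # slope arriving at (u_i, v_i)
        q = (vs[i + 1] - vs[i]) / (us[i + 1] - us[i])  # slope leaving (u_i, v_i)
        grad.append(f_vprime(p) - f_vprime(q))
    return grad

def grad_by_differences(us, vs, h=1e-6):
    """Central finite differences of J in the interior ordinates, for comparison."""
    grad = []
    for i in range(1, len(us) - 1):
        up = list(vs)
        up[i] += h
        dn = list(vs)
        dn[i] -= h
        grad.append((J(us, up) - J(us, dn)) / (2 * h))
    return grad

us = [0.0, 1.0, 2.0, 3.0]    # u_0 < u_1 < u_2 < u_3; the end points are held fast
vs = [0.0, 0.5, -0.2, 1.0]   # v_1 and v_2 are the free variables
print(grad_by_formula(us, vs))
print(grad_by_differences(us, vs))
print(grad_by_formula(us, [0.0, 1.0, 2.0, 3.0]))  # collinear points: a critical point
```

In this special case the two gradients agree to within the error of the difference quotient, and the gradient vanishes exactly when the vertices lie on one straight line, that is, on a single extremal.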

11. Relations between critical points. A lemma fundamental for our
present purposes is derived from the Corollary to Theorem 8, page 392 of
the paper of the author's already cited. Before stating the lemma let it be
agreed that the positive normal to an analytic boundary of a region Σ
will be understood to be that sensed normal that leads from points in Σ to
points without Σ.


Lemma 1. Let there be given a closed region Σ in the space of the n variables
$(x_1, \cdots, x_n)$, bounded by a closed analytic manifold, without singularity and
homeomorphic with the interior and boundary of an $(n-1)$-dimensional
hypersphere. In Σ let there be given a function $f(x_1, \cdots, x_n)$, of class C''' at
each point of Σ, and possessing on the boundary of Σ a normal directional derivative
that is positive. Suppose the critical points of $f(x_1, \cdots, x_n)$ are all of rank
n. Of the type numbers k of these critical points, let m be the maximum. Let
$M_k$ be the number of critical points of type k. Then between the integers $M_k$
the following relations hold:


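(R)
$1 \le M_0,$
$1 \ge M_0 - M_1,$
$1 \le M_0 - M_1 + M_2,$
$1 \ge M_0 - M_1 + M_2 - M_3,$
$\cdots$
$1 \le M_0 - M_1 + M_2 - M_3 + \cdots + (-1)^{m-1} M_{m-1}$ (with $\le$ replaced by $\ge$ when m-1 is odd),
$1 = M_0 - M_1 + M_2 - M_3 + \cdots + (-1)^{m} M_m.$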

In terms of the functions of the Lemma, §8, we can say that the domain
of the points $(v_1, \cdots, v_n) = (V)$ for which the function $J(v_1, \cdots, v_n)$ of
§10 is defined, is a rectangular hyperspace made up of points (V) satisfying
the inequalities

(1) $A(u_i) \le v_i \le B(u_i) \quad (i = 1, 2, \cdots, n),$

where the numbers $(u_1, \cdots, u_n)$ are constants specified in §10. The boundary
of the domain obviously consists of hyperrectangles on 2n hyperplanes

$v_i = A(u_i), \quad v_i = B(u_i) \quad (i = 1, 2, \cdots, n),$

taken separately.

To apply Lemma 1 of this section we need to find the normal directional
derivative of $J(v_1, \cdots, v_n)$ at points on the boundary of (1).

Let $(a_1, \cdots, a_n)$ be a point (V) of (1) for which $a_1 = A(u_1)$, that is,
a point on the bounding hyperplane $v_1 = A(u_1)$. If we hold the variables
$(v_2, \cdots, v_n)$ constantly equal to $(a_2, \cdots, a_n)$ respectively, and let $v_1$
decrease so as to pass through $a_1$, the corresponding point $(v_1, a_2, \cdots, a_n)$
will vary on a normal to the hyperplane $v_1 = A(u_1)$, and pass out of the
domain (1). But the point $(a_1, a_2, \cdots, a_n)$ will correspond
(§10) to a succession of n+1 extremal segments joining A to B in R. Of
these segments the first two will adjoin each other in R at the point
$(u, v) = (u_1, a_1)$. The partial derivative of $J(v_1, \cdots, v_n)$ with respect to
$v_1$ at the point $(a_1, \cdots, a_n)$ can be obtained from (4), §10, by putting i=1.
Using the law of the mean we then obtain the equation

(2) $J_{v_1}(a_1, \cdots, a_n) = f_{v'v'}(u_1, a_1, \bar{p})\,[p_1 - q_1],$

where $\bar{p}$ is a number between $p_1$ and $q_1$. From the extremal-convex nature
of the boundary it follows that $p_1 < q_1$. Thus this derivative (2) is negative.

The directional derivative of J along the normal to the hyperplane
$v_1 = A(u_1)$ at the point $(a_1, a_2, \cdots, a_n)$ is to be taken in the sense that
leads out of (1), that is in the sense of decreasing $v_1$. This directional derivative
is thus equal to the left hand member of (2) multiplied by (-1), and is



accordingly positive. Similarly the directional derivative of J along a 
positive normal to any of the other hyperplanes bounding (1) may be seen 
to be positive. 

But we cannot as yet apply the lemma on critical points because the
boundary of (1) is made up of portions of 2n hyperplanes instead of one
analytic manifold. To meet this difficulty let O be any interior point of
(1), and let straight line rays be drawn from O to each point of the boundary
of (1). It follows from the results of the preceding paragraph that the
directional derivative of $J(v_1, \cdots, v_n)$, at a point P on the boundary of (1),
taken along the ray joining O to P in the sense that leads from O to P is
always positive. Now the domain (1) can be projectively transformed into
an n-dimensional hypercube that lies in the space of $(y_1, \cdots, y_n)$, and that
is bounded by the $(n-1)$-dimensional hyperplanes

(3) $y_i = \pm 1 \quad (i = 1, 2, \cdots, n).$

This hypercube can be approximated to by the analytic manifold

(4) $y_1^{2r} + y_2^{2r} + \cdots + y_n^{2r} = 1,$

where r is a positive integer. In fact, if e be a positive constant, it is easy
to show that for r sufficiently large, points on the above hypercube and the
manifold (4) that lie on the same ray issuing from the origin will be within
a distance e of one another.
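
The approximation just asserted can be illustrated numerically. The sketch below assumes that the hypercube is normalized to $|y_i| \le 1$ and that the manifold is $y_1^{2r} + \cdots + y_n^{2r} = 1$. On the ray $t\,d$ through a unit vector d the cube is met at $t = 1/\max_i |d_i|$ and the manifold at $t = (\sum_i d_i^{2r})^{-1/(2r)}$. The function name is hypothetical.

```python
import math

def ray_gap(direction, r):
    """Distance along a ray from the origin between the boundary of the cube
    |y_i| <= 1 and the manifold sum_i y_i^(2r) = 1.

    `direction` is any nonzero vector; it is normalized to unit length first.
    """
    norm = math.sqrt(sum(x * x for x in direction))
    d = [abs(x) / norm for x in direction]
    big = max(d)
    t_cube = 1.0 / big
    # factor out the largest component to avoid underflow for large r
    s = sum((x / big) ** (2 * r) for x in d)
    t_manifold = t_cube * s ** (-1.0 / (2 * r))
    return t_cube - t_manifold

diagonal = [1.0, 1.0, 1.0]   # the direction in which the gap is largest
for r in (1, 5, 50, 500):
    print(r, ray_gap(diagonal, r))
print(ray_gap([1.0, 0.0, 0.0], 5))   # along a coordinate axis the gap is zero
```

Along the diagonal of the n-cube the gap equals $\sqrt{n}\,(1 - n^{-1/(2r)})$, which tends to zero as r increases; for n = 3 and r = 1 it is $\sqrt{3} - 1$.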

Let now the hypercube be projectively transformed back into (1), and 
suppose the manifold (4) goes into a manifold M. Let the point O from which 
rays were drawn in (1) be the image of the origin in the hypercube. The rays 
in (1) will each meet the manifold M in a single point. If r be sufficiently 
large, M will approximate the domain (1) so closely that the directional 
derivatives of J at points of M on the rays issuing from O in the sense that 
leads away from O, will all be positive. It follows that the outer normal direc- 
tional derivatives at points of M will be positive. 

A second requirement on M is that it approximate the domain (1) so
closely that it contain in its interior all the critical points of $J(v_1, \cdots, v_n)$
that (1) contains. The manifold M will serve as the manifold bounding Σ in Lemma
1 of this section.

The critical points of Lemma 1 are of rank n. For the moment, then, we
restrict ourselves to extremals joining A to B on which A is not conjugate
to B. Reference to Theorem 2 of §6 and Lemma 1 of this section gives the
lemma.


Lemma 2. Let there be given regions S and $S_1$, and integrand F, satisfying
the hypotheses I, II, III, and IV, of §7 and §8. In $S_1$ let there be given two
points A and B which are joined by no extremals on which A is conjugate
to B. Let the number of extremals joining A to B in $S_1$ on which there are k
conjugate points of A prior to B be denoted by $M_k$. Let m be an integer equal
to the maximum of these integers k. Then between the numbers $M_k$ the relations
(R) of Lemma 1 hold.


12. A theorem in the large. We seek now to remove from Lemma 2,
§11, the restriction that on no extremal joining A to B in $S_1$ is A conjugate to
B. Before doing this it will be necessary to recall certain results obtained
by Lindeberg.*

Let there be given in R an analytic extremal $E_0$ of the form

$v = E(u), \quad c \le u \le d,$

joining A to B, and on which the point at which u=c has for its (m+1)st
conjugate point the point at which u=d. Let a be the slope at A of any extremal
through A. In particular let $a = a_0$ be the slope of $E_0$ at A. That part
of the envelope of the family of extremals through A which lies in the
neighborhood of B, will not here consist merely of the point B. For according
to the envelope theory this could only happen if all the extremals
through A with slopes a near $a_0$ should pass through B, contrary to results
in §9.

According to Lindeberg, the envelope T of the family of extremals
passing through A, with a slope a near $a_0$, will then be of the form

$v - E(u) = (a - a_0)^{r+1} K(a), \quad r > 0, \quad K(a_0) \neq 0,$
$u - d = (a - a_0)^{r} H(a), \quad H(a_0) \neq 0,$

where K(a) and H(a) are analytic in a at $a = a_0$, and r is a positive integer.

Three classes of envelopes can be distinguished:

Class I: r is odd;

Class II: r is even and $H(a_0) < 0$;

Class III: r is even and $H(a_0) > 0$.

Class I. Here the envelope T is tangent to the extremal $E_0$ at B, has no
cusp there, and lies wholly on one side of $E_0$ near B. If the class of extremals
through A be restricted to extremals E for which $|a - a_0|$ is sufficiently
small, we can say that through each point P, not on T, but sufficiently near
B, and on the same side of T as $E_0$, there will pass just two extremals of the
set E. On these two extremals the type number k, that is, the number of
points which are conjugate to A and prior to P, will equal m and m+1,


* Lindeberg, Mathematische Annalen, vol. 59 (1904), p. 321. 




respectively. Through any point P, not on T, but sufficiently near B and
on the opposite side of T from $E_0$, there will pass no extremals of the set E.

Class II. Here T is tangent to the extremal $E_0$ at B, but has a cusp there.
On the envelope near B, u<d, except for the point B. The two branches of
the cusp lie on opposite sides of $E_0$. A point P sufficiently near B, within the
cusp, but not on T, can be joined to A by three extremals neighboring $E_0$.
On these three extremals the type numbers k will have the values m+1,
m+1, and m, respectively. Through any point P sufficiently near B, without
the cusp, but not on T, there passes just one extremal issuing from A and
neighboring $E_0$. On this extremal k=m+1.

Class III. Here T is tangent to the extremal $E_0$ at B, but has a cusp
there. On the envelope near B, u>d, except for the point B. The two
branches of the cusp lie on opposite sides of $E_0$. A point P, sufficiently near
B, within the cusp, but not on T, can be joined to A by three extremals
neighboring $E_0$. On these three extremals k has the values m, m, and m+1,
respectively. Through each point P sufficiently near B, without the cusp,
and not on T, there passes just one extremal issuing from A and neighboring
$E_0$. On this extremal k=m.

In the following theorem there are a number of conventions to be adopted.
Let g be an extremal joining A to B on which there are m points conjugate
to A prior to B. If B is not conjugate to A, g is to be counted as one extremal
of type k=m. If B is conjugate to A, and g belongs to Class I, of the preceding
classification, g is to be counted as two extremals, of types k=m+1 and k=m
respectively. If B is conjugate to A, and g belongs to Class II, or to Class III,
then g is to be counted as one extremal of type k=m+1, or of type k=m, respectively.
We can now prove the following theorem "in the large."


THEOREM 3. Let there be given regions S and $S_1$ and an integrand F satisfying
hypotheses I, II, III, and IV, of §7 and §8. Let A and B be any two
points of $S_1$. A first conclusion is that there are at most a finite number of extremals
g joining A and B and lying in $S_1$. Let $M_k$ be the number of these
extremals g of type k, counted according to the conventions preceding this
theorem, and let m be the maximum of these integers k. Then between the
numbers $M_k$, the relations (R) of Lemma 1, §11, hold true.
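
The classification of envelopes and the counting conventions that precede the theorem can be summarized in a short sketch. The function names and the string labels "I", "II", "III" are hypothetical; the rules themselves are those stated in the text.

```python
def envelope_class(r, H0):
    """Class of the envelope from the positive integer r and the value H0 = H(a_0)."""
    if r <= 0 or H0 == 0:
        raise ValueError("r must be a positive integer and H(a_0) must not vanish")
    if r % 2 == 1:
        return "I"
    return "II" if H0 < 0 else "III"

def counted_types(m, conjugate, cls=None):
    """Type numbers contributed by an extremal g joining A to B.

    m         -- number of points on g conjugate to A and prior to B
    conjugate -- True if B itself is conjugate to A on g
    cls       -- "I", "II" or "III"; needed only when conjugate is True
    """
    if not conjugate:
        return [m]            # one extremal of type m
    if cls == "I":
        return [m + 1, m]     # counted as two extremals
    if cls == "II":
        return [m + 1]
    if cls == "III":
        return [m]
    raise ValueError("an envelope class is required when B is conjugate to A")

print(envelope_class(1, 2.0), counted_types(2, True, envelope_class(1, 2.0)))
print(envelope_class(2, -1.0), counted_types(2, True, envelope_class(2, -1.0)))
print(envelope_class(2, 1.0), counted_types(2, True, envelope_class(2, 1.0)))
print(counted_types(2, False))
```

The integers $M_k$ of the theorem are then obtained by tallying, over all extremals g joining A to B, the type numbers so returned.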


That there are at most a finite number of extremals g joining A to B
in $S_1$, was proved in §9. If on no extremal g, A is conjugate to B, the remainder
of the theorem follows from Lemma 2, §11.

If there are a number of extremals g on which A is conjugate to B,
each of these extremals g will be tangent at B to an envelope T of extremals
issuing from A and neighboring that g. Let L be a short straight line segment
passing through B, lying in $S_1$, and tangent to none of these envelopes. Any
point P on L, not B, but sufficiently near B, will lie on none of these envelopes.
Concerning the possibility of joining A to P by extremals g' in $S_1$, we can
state the following:

Corresponding to any extremal g on which A is not conjugate to B 
there will exist one extremal g’ neighboring g, joining A to P, and of the 
same type as g. 

If g be an extremal on which A is conjugate to B, and g is of Class II, 
the point P if sufficiently near B, will lie without the corresponding cusp. 
According to the preceding description of Class II there will then be just 
one extremal g’ joining A to P and neighboring g. On g’, A will not be con- 
jugate to P, and the number of conjugate points of A prior to P will equal 
the type number k which we have agreed to assign to g. The facts are similar 
for extremals g of Class III.

If g be any extremal of Class I, joining A to B, let T be the corresponding 
envelope neighboring B. We consider two cases. In Case I we suppose P 
lies on the same side of T as g. In this case A and P can be joined by two 
extremals neighboring g, on which A is not conjugate to P, and whose 
type numbers are the numbers we have agreed to assign to g. In Case II 
we suppose P lies on the opposite side of T from g. In this case there will
exist no extremals g' neighboring g passing from A to P. Relative to this
case we observe that if the relations (R) of Lemma 1, §11, hold between any
given set of integers $M_k$, these relations will also hold if we replace $M_i$ and
$M_{i+1}$ by $M_i + 1$ and $M_{i+1} + 1$ for any particular i (i = 1, 2, ···, m). Finally,
if P be sufficiently near B, there will be no other extremals g’ joining A to P 
than those just enumerated. For if there were more, we could prove by a 
limiting process that there would be other extremals g joining A to B besides 
those first supposed to exist. 
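
The observation that the relations (R) persist when two adjacent integers $M_i$ and $M_{i+1}$ are each increased by unity can be verified by enumeration over small cases. The sketch below is a finite check under the stated reading of (R), not a proof; the function names and the bound on the integers are arbitrary choices made for illustration, and only pairs with $i + 1 \le m$ are tried.

```python
from itertools import product

def holds_R(M):
    """True if M[0], ..., M[m] satisfy (R): alternating partial sums compared
    with 1, and the full alternating sum equal to 1."""
    m = len(M) - 1
    s = 0
    for k in range(m + 1):
        s += (-1) ** k * M[k]
        if k < m and (-1) ** k * (s - 1) < 0:
            return False
    return s == 1

def bump(M, i):
    """Replace M_i and M_{i+1} by M_i + 1 and M_{i+1} + 1."""
    out = list(M)
    out[i] += 1
    out[i + 1] += 1
    return out

checked = 0
failures = []
for M in product(range(4), repeat=4):     # m = 3, each M_k between 0 and 3
    if M[-1] == 0 or not holds_R(M):
        continue                          # keep m = 3 the actual maximum type
    for i in range(3):
        checked += 1
        if not holds_R(bump(M, i)):
            failures.append((M, i))
print(checked, failures)
```

The reason no failure can occur is that the increase changes only the partial sum ending at $M_i$, and changes it in the direction favorable to the corresponding inequality.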

The extremals which we have just proved join A and P are all extremals 
on which A is not conjugate to P. Concerning them Lemma 1 of §11 holds. 
If there are no extremals g joining A to B, of Class I, Case II, the theorem 
follows directly. But if there are extremals g of Class I, Case II, the relations 
of Lemma 1, §11, hold if all extremals g be counted except those of Class I, 
Case II. As we have already noted, the counting of extremals of Class I,
Case II, as if they were in Class I, Case I, will not alter the validity of the
relations (R) of §11, if these relations hold true prior to such a change. Thus
the theorem is proved.

Note. We could prove, by the methods of the paper on critical points, that 
extremals g of Class I could be omitted from the count entirely, and the relations 
(R), §11, still hold true. Such a proof would, however, lead us too far astray. 




Another important question is whether there are relations between the
numbers $M_k$, other than those affirmed in the Theorem. For the case of
functions in general, apart from the calculus of variations, the answer is that
there are no other relations without further hypotheses. More specifically
the author has proved that if there be given any set of integers $M_i$ satisfying
the relations (R) of Lemma 1, §11, there exists a function f of class C'' within
and on a unit (n-1)-sphere, which on this (n-1)-sphere satisfies the boundary
conditions of Lemma 1, and which possesses for each i, $M_i$ critical points of
type i, and no other critical points of any sort.*

For the case of the calculus of variations the author has shown that
if under the hypotheses of Theorem 3 there is more than one extremal
joining A and B, then there are at least two such extremals of minimum type.
Except for this, Dr. Richmond, National Research Fellow at Harvard, has
recently shown, for the case where m <3 in the Theorem, that an example can
be set up in the calculus of variations corresponding to any set of integers $M_i$
satisfying the relations (R).

13. Existence of extremal-convex boundaries. Let there be given in the
(x, y) plane an open or closed curve γ, without multiple points, and made up
of a finite number of arcs of class C'. If γ is closed, the points neighboring
and not on γ make up two disconnected regions. If γ is open, the points
neighboring γ, not on γ, and lying on short perpendiculars to the component
arcs of γ, slightly extended at the vertices, again make up two distinct
regions. These two disconnected regions will be called the sides of γ.

The curve γ will be said to be extremal-convex relative to one of its sides S,
if the angles at its vertices on the side S are between 0 and π, and if an extremal
tangent to γ at any point P has no other point than P in common with γ or S in
the neighborhood of P.

Let g be any open or closed curve of class C’ and without multiple points. 
A curve g’ of class C’ will be said to lie arbitrarily near g, in position and direc- 
tion, if corresponding to an arbitrarily small positive constant e, a one-to-one 
continuous correspondence can be set up between g and g’, in such a fashion 
that corresponding points are within a distance e of one another, and direc- 
tion cosines of tangents at corresponding points differ by at most e. We 
can now prove the following lemma: 


Lemma 1. Let g be any extremal segment satisfying the hypothesis in §1,
and in addition derived from a reversible problem (§7).
(A) Then there can be found a curve $g_1$ of class C''', arbitrarily near to g


* Morse, The analysis and analysis situs of regular n-spreads in (n+r)-space, Proceedings of the
National Academy of Sciences, vol. 13 (1927), pp. 813-817.




in position and direction, lying wholly on an arbitrary side of g, and such that
$g_1$ will be extremal-convex relative to the side of $g_1$ that does not contain g.

(B) If g contains no conjugate point to its end points, then in addition to
$g_1$ there can also be found a second curve $g_2$ with the same properties as $g_1$,
except that $g_2$ will be extremal-convex relative to the side of $g_2$ that contains g.


To prove (A) we take the problem into the (u, v) plane as in §2, so that g
becomes a segment of the u axis. The differential equation of the extremals
near g can be put in the form

(1) $v'' = A(u, v, v'),$

where A(u, v, v') is of class C''' for (u, v) neighboring g, and any number v'
sufficiently small in absolute value. Alongside of (1) let us consider the
differential equation

(2) $v'' = A(u, v, v') + Mv,$

where M is a positive constant such that on g

(3) $M > -A_v(u, 0, 0).$

Now v=0 will represent a solution of (2), as well as of (1). The differential
equation of the first variation, corresponding to a solution of (2), set up in
particular for g, on which v=0, will be

(4) $w'' = A_{v'}(u, 0, 0)\,w' + [A_v(u, 0, 0) + M]\,w.$

Let us compare (4), by Sturm's method, with

(5) $w'' = A_{v'}(u, 0, 0)\,w'.$

Since (5) has a solution w = constant ≠ 0, (4) has no solution except w=0
which vanishes twice. Hence there is no conjugate point on g to either end
point of g, relative to the differential equation (2).

Accordingly there exists a curve segment $g_1$, which represents a solution
of (2), which lies arbitrarily near g, and on which v>0. A comparison of the
value of v'', say $v_2''$, given by (2) for a v and v' on $g_1$, with the value, say
$v_1''$, of v'' given by (1) for the same v and v', shows that $v_2'' > v_1''$. The proof
is similar for the side of g where v<0.

To prove (B), compare (1) with

(6) $v'' = A(u, v, v') - c\,v^2,$

where c is a positive constant. The equations of first variation corresponding
to a solution of (6), or a solution of (1), respectively, are identical when
set up for g, that is for v=0. Accordingly there is no conjugate point on g


to any point on g, when g is considered as a solution of (6). Hence a curve
segment $g_2$ exists that represents a solution of (6), that lies arbitrarily near g,
and on which v>0. Part (B) follows on comparing the v'' given by (6)
with that given by (1) for the same v and v'. Lemma 1 leads to the following
lemma.


Lemma 2. Let there be given in the region S of §1 a region S' bounded by
a simple closed curve γ made up of a finite number of extremal segments g of §1,
at least two in number, and making interior angles between 0 and π. Suppose
further that the problem is reversible.

(A) Then each arc g may be replaced by an arc $g_1$ of class C''', arbitrarily
near g, within S', and such that the set of arcs $g_1$ form a simple closed curve
extremal-convex relative to its interior.

(B) If the end points of each arc g have no conjugate points on that arc g,
each arc g may be replaced by an arc $g_2$ of class C''', arbitrarily near g, exterior
to S', and such that the set of arcs $g_2$ form a simple closed curve extremal-convex
relative to its interior.


Note: The proof shows that when the integrand is analytic the preceding
curves $g_1$ and $g_2$ may be taken as analytic.

14. An example. Let a surface of revolution be defined by revolving
an analytic open or closed curve G, without singularities, about an axis
lying in a plane with G, but not intersecting G. Let the surface be referred
to parameters (u, v), of which u measures the angle through which G has been
revolved from an initial position, and of which v represents arc length
measured along G from a point chosen on G. The meridians u=constant are
all geodesics. The parallels $v = v_0$ are geodesics, if at the point $v = v_0$ the tangent
to G is parallel to the axis of revolution. These facts follow at once from
the differential equations of the geodesics.

Let us represent the surface in the (u, v) plane. Let there be given in the
(u, v) plane a rectangle S, bounded by any two curves u=constant, and by
any two curves v=constant which represent geodesics. The rectangle S
can be replaced by a region $S_1$, interior to the rectangle, but differing from
the rectangle arbitrarily little, and bounded by a curve extremal-convex
relative to its interior (Lemma 2, §13). The curves u=constant will form
the field of extremals contemplated in §8, so that Theorem 3, §12, applies
to $S_1$.

If the curves v=constant bounding the rectangle S represent parallels
on which the distance to the axis has a proper or improper minimum relative
to neighboring parallels, then these curves v=constant have no conjugate
points on them, as can be readily proved. The curves u=constant never
have conjugate points on them. According to Lemma 2, §13, the rectangle S
can in this case be replaced by a region $S_1$ slightly larger than S, bounded by a
curve again extremal-convex relative to its interior. To $S_1$ Theorem 3, §12,
will then apply. In particular the torus presents several different types of
regions $S_1$ to which Theorem 3, §12, applies.
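
For the torus the statement about conjugate points on the bounding parallels can be made concrete. The sketch below assumes that G is a circle of radius b whose center lies at the distance a > b from the axis; the parameters a, b and the function names are illustrative only. With v the arc length on G, the distance to the axis is $\rho(v) = a + b\cos(v/b)$, the parallel $v = v_0$ is a geodesic when $\rho'(v_0) = 0$, and the Gaussian curvature along it is $K = -\rho''(v_0)/\rho(v_0)$, the standard formula for a surface of revolution. The Jacobi equation along such a parallel is $w'' + Kw = 0$ in terms of arc length, so that there are no conjugate points when $K \le 0$, while for $K > 0$ the first conjugate point lies at the distance $\pi/\sqrt{K}$.

```python
import math

def rho(v, a, b):
    """Distance from the axis to the point of G at arc length v."""
    return a + b * math.cos(v / b)

def rho_second(v, a, b):
    """Second derivative of rho with respect to the arc length v."""
    return -math.cos(v / b) / b

def curvature_on_parallel(v0, a, b):
    """Gaussian curvature K = -rho''/rho along the parallel v = v0."""
    return -rho_second(v0, a, b) / rho(v0, a, b)

def first_conjugate_distance(K):
    """Arc length to the first conjugate point for w'' + K w = 0, or None if K <= 0."""
    return math.pi / math.sqrt(K) if K > 0 else None

a, b = 2.0, 1.0
outer = curvature_on_parallel(0.0, a, b)           # maximum distance to the axis
inner = curvature_on_parallel(math.pi * b, a, b)   # minimum distance to the axis
print(outer, first_conjugate_distance(outer))
print(inner, first_conjugate_distance(inner))
```

With these values the inner parallel, on which the distance to the axis is a minimum, has negative curvature and carries no conjugate points, in agreement with the text; the outer parallel has positive curvature and does carry conjugate points.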


Part III. THE TYPE NUMBER OF A NON-DEGENERATE PERIODIC EXTREMAL 


15. The function $J(v_1, \cdots, v_n)$. We start here with the same assumptions
regarding the integrand $F(x, y, \dot{x}, \dot{y})$ as we made in §1. We suppose
here, however, that we have given a periodic extremal g of length ω and of
class C'''. We again suppose $F(x, y, \dot{x}, \dot{y}) > 0$ along g.

Let u be the arc length measured along g in the given sense. Let d be any
positive constant less than the minimum distance between any two successive
conjugate points on g. By an admissible integer n, and constants $u_1, u_2, \cdots, u_n$,
will be meant an integer n>1, and constants $u_i$ increasing with their
subscripts, all less than $u_1 + \omega$, and such that no one of the closed intervals
on g consisting of points corresponding to a segment of the u axis bounded
by successive points of the set $u_1, u_2, \cdots, u_n, u_1 + \omega$, exceeds d in length.
Let $h_1, h_2, \cdots, h_n$ be n short arcs of class C''', crossing g respectively at
$u_1, u_2, \cdots, u_n$, but not tangent to g, and let $v_i$ be the arc length measured
along $h_i$ as in §1.


Let the point on $h_i$ at the distance $v_i$ from g be denoted by $(u_i, v_i)$. If $v_i$
be sufficiently small in absolute value, the successive points of the set

$(u_1, v_1), (u_2, v_2), \cdots, (u_n, v_n), (u_1 + \omega, v_1)$

can be joined by unique extremals arbitrarily near segments of g between
successive points on g at which u takes on respectively the values
$u_1, u_2, \cdots, u_n, u_1 + \omega$. The value of the integral J taken along this succession
of extremal segments will be again denoted by $J(v_1, \cdots, v_n)$.

16. The second partial derivatives of $J(v_1, \ldots, v_n)$. As in §2, g can be
mapped onto the u axis in the (u, v) plane, with the additional fact that here
the transformation from the (x, y) to the (u, v) plane can be taken as one
in which x and y will be functions of u and v with a period $\omega$ in u. Points in
the (u, v) plane whose u coordinates differ by $\omega$ will be termed congruent.
The periodic extremal g will be represented in the (u, v) plane by any segment
of the u axis of length $\omega$. The function f(u, v, v’) derived from
$F(x, y, \dot{x}, \dot{y})$ as in §3 will have a period $\omega$ in u, as will the
coefficients of the J. D. E. corresponding to the extremal v=0. As in §2, so
here, the transformation from the (x, y) to the (u, v) plane may be made to
preserve distances along g and the arcs $h_i$, which become respectively in the
(u, v) plane the curves


238 MARSTON MORSE [April 


v=0 and $u = u_i$. Thus the function $J(v_1, \ldots, v_n)$, set up in the preceding
section, will here equal the value of the integral in the (u, v) plane taken
along extremal segments joining the successive points


(1) $(u_1, v_1), (u_2, v_2), \ldots, (u_n, v_n), (u_1 + \omega, v_1)$


of the (u, v) plane.

The function $J(v_1, \ldots, v_n)$ will have a critical point when
$(v_1, \ldots, v_n) = (0, \ldots, 0)$.
To determine the nature of that critical point we shall examine the second
partial derivatives of J.

In §4 the points $(u_1, v_1)$ and $(u_n, v_n)$ played a special role because they
were adjacent to the end points of the given extremal. Here a set of formulas
giving the second partial derivatives should still hold after a circular
permutation of the points $(u_1, v_1), \ldots, (u_n, v_n)$. Such a circular
permutation would be equivalent to advancing all the subscripts by the same
integer, provided for any integer i we set


(2) $u_{i+n} = u_i + \omega, \quad v_{i+n} = v_i.$


With (2) understood we now repeat the definitions and assertions of §4 
associated with (1), (2), (3), (4), (5) and (6) of §4. 

The values of $J^0_{v_i v_j}$ are again given by (7) of §4 subject not to (8) of §4,
but to the limitations (3) as follows:


(3) $i = 1, 2, \ldots, n, \quad v_0 = v_n, \quad v_{n+1} = v_1.$


Instead of (9) in §4 we have here


(4) $J^0_{v_i v_j} = 0, \quad |i - j| \neq 0, 1, \text{ or } n - 1,$


subject also to (3). Equations (10), (11), and (12), of §4, hold here exactly 
as given in §4. From (7) of §4, as qualified by (3) above, we obtain finally 


v0; R( ten) Wn 
oS viv, = R( uy) ,o( 1). 


The two partial derivatives of J just given in (5) were given in §4 by (9) 
and were there zero; they are the only two second partial derivatives of J for 
which the formulas of this section differ from those of §4. 
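
Why only the two corner entries change can be seen from how J is assembled. The following is a sketch only, not taken from the text: it assumes $n \geq 3$, and $I_i$ is a name introduced here for the integral along the extremal segment joining $(u_i, v_i)$ to $(u_{i+1}, v_{i+1})$.

```latex
% Sketch (assumptions: n >= 3; I_i is notation introduced for this illustration).
\[
  J(v_1, \ldots, v_n) = \sum_{i=1}^{n} I_i(v_i, v_{i+1}), \qquad v_{n+1} = v_1 .
\]
% Each v_i enters only I_{i-1} and I_i, so a mixed second partial can be
% different from zero only for cyclically neighboring subscripts:
\[
  \frac{\partial^2 J}{\partial v_i \, \partial v_j} = 0
  \quad \text{unless } j \equiv i,\; i+1, \text{ or } i-1 \pmod{n} .
\]
% For n = 4 the matrix of second partials therefore has the cyclic shape
\[
  \begin{pmatrix}
    * & * & 0 & * \\
    * & * & * & 0 \\
    0 & * & * & * \\
    * & 0 & * & *
  \end{pmatrix} .
\]
```

The two corner entries of this pattern are the derivatives with subscripts 1 and n; with them set to zero, only neighboring subscripts interact, which is the situation described for §4.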

17. Periodic extremals classified: non-degenerate, simply-degenerate, 
and doubly-degenerate. Before going further it is necessary to consider 
more in detail the J. D. E. as set up in §3 for the extremal v=0 in the (u, v)
plane. We note here however that P(u), Q(u), and R(u), have the period $\omega$
in u. In terms of the solutions of this J. D. E. we distinguish three different
kinds of periodic extremals.




I. There are no solutions of the J. D. E. with the period $\omega$ other than $w(u) \equiv 0$.

II. There is a set of solutions with period $\omega$ of the form Cw(u), where
$w(u) \not\equiv 0$, and C is any constant, but no other solutions with period $\omega$.

III. Every solution of the J. D. E. has the period $\omega$.


In these three cases we shall say, respectively, that the periodic extremal 
is I, non-degenerate, II, simply-degenerate, and III, doubly-degenerate. 

Let p(u) and q(u) be two solutions of the J. D. E. that satisfy the initial
conditions


(1) $p(0) = 1, \quad q(0) = 0, \quad p'(0) = 0, \quad q'(0) = 1.$

Abel’s integral for these solutions becomes

(2) $R(u)[p(u)q'(u) - p'(u)q(u)] = \text{constant}.$

A substitution of u=0 and of $u = \omega$ in (2) gives the well known result

(3) $p(\omega)q'(\omega) - p'(\omega)q(\omega) = 1.$


We state without proof the following known result.


The given periodic extremal will be non-degenerate, simply-degenerate, or
doubly-degenerate, according as the rank of the matrix


(4) $\begin{pmatrix} p(\omega) - 1 & q(\omega) \\ p'(\omega) & q'(\omega) - 1 \end{pmatrix}$


is 2, 1, or 0.
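
As an illustration only, the three cases can be checked on a J. D. E. with constant coefficients. The sketch below assumes $R \equiv 1$ and that the J. D. E. reduces to $w'' + cw = 0$ with a constant c; this model equation and the abbreviation $\theta$ are not from the text.

```latex
% Assumption: model J. D. E.  w'' + c w = 0  with R = 1 and period \omega.
% Case c > 0, with \theta = \sqrt{c}\,\omega:
\[
  p(u) = \cos(\sqrt{c}\,u), \qquad q(u) = \frac{\sin(\sqrt{c}\,u)}{\sqrt{c}},
  \qquad p\,q' - p'\,q = \cos^2 + \sin^2 = 1 ,
\]
\[
  \begin{vmatrix} p(\omega) - 1 & q(\omega) \\ p'(\omega) & q'(\omega) - 1 \end{vmatrix}
  = (\cos\theta - 1)^2 + \sin^2\theta = 2 - 2\cos\theta .
\]
% The rank is 2 unless \theta is a multiple of 2\pi; in that case every entry
% of the matrix vanishes and the rank is 0 (doubly-degenerate).
% Case c = 0:  p(u) = 1, q(u) = u, and the matrix is
\[
  \begin{pmatrix} 0 & \omega \\ 0 & 0 \end{pmatrix},
\]
% of rank 1 (simply-degenerate); the periodic solutions are the constants C.
% Case c < 0, with \theta = \sqrt{-c}\,\omega:  p = \cosh, q = \sinh/\sqrt{-c},
\[
  \begin{vmatrix} p(\omega) - 1 & q(\omega) \\ p'(\omega) & q'(\omega) - 1 \end{vmatrix}
  = (\cosh\theta - 1)^2 - \sinh^2\theta = 2 - 2\cosh\theta < 0 ,
\]
% so the rank is 2 (non-degenerate).
```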


18. Non-degenerate periodic extremal segments: convex, concave, or 
conjugate. Let us represent solutions w(u) of the J. D. E. as curves in the 
(u, w) plane. We will now prove the following lemma. 


Lemma 1. If the given periodic extremal is non-degenerate, a necessary
and sufficient condition that in the (u, w) plane every point (0, b) be capable of
being joined to its congruent point ($\omega$, b) by a solution of the J. D. E., is that
u=0 be not conjugate to $u = \omega$.


Suppose a point (0, b), not (0, 0), can be joined to the point ($\omega$, b) by a
solution of the J. D. E. Such a solution will be of the form


(1) $b\,p(u) + c\,q(u), \quad b \neq 0,$


where c is a suitably chosen constant. Since the solution passes through
($\omega$, b) we have


(2) $b\,p(\omega) + c\,q(\omega) = b.$




Now if $q(\omega)$ were zero, from (2) it would follow that $p(\omega) = 1$, whence the
rank of the matrix (4), §17, would be less than 2, contrary to the fact that
the J. D. E. has no periodic solutions other than $w(u) \equiv 0$. Thus the condition
$q(\omega) \neq 0$ is a necessary consequence of the existence of solutions of the
J. D. E. joining any point (0, b) to its congruent point ($\omega$, b).

To prove that if $q(\omega) \neq 0$, any point (0, b) can be joined to ($\omega$, b) by a
solution of the J. D. E., say w(u, b), we exhibit such a solution, namely,


(3) $w(u, b) = \frac{b}{q(\omega)} \begin{vmatrix} p(u) & q(u) \\ p(\omega) - 1 & q(\omega) \end{vmatrix}.$


Thus the lemma is proved.
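
The end conditions of the exhibited solution can be verified in one line each. The check below is a sketch that writes the determinant of (3) out as $p(u)q(\omega) - q(u)(p(\omega) - 1)$ and uses only the initial values p(0) = 1, q(0) = 0 of §17.

```latex
% Solution joining (0, b) to (\omega, b), with the determinant expanded:
\[
  w(u, b) = \frac{b}{q(\omega)}\,\bigl[\,p(u)\,q(\omega) - q(u)\,(p(\omega) - 1)\,\bigr] .
\]
% At u = 0, using p(0) = 1 and q(0) = 0:
\[
  w(0, b) = \frac{b}{q(\omega)}\; q(\omega) = b .
\]
% At u = \omega the products p(\omega) q(\omega) cancel:
\[
  w(\omega, b) = \frac{b}{q(\omega)}\,\bigl[\,p(\omega)q(\omega) - q(\omega)p(\omega) + q(\omega)\,\bigr] = b .
\]
```

Being a linear combination of p(u) and q(u), the function w(u, b) is itself a solution of the J. D. E.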
From equation (3) we obtain the following:

(4) $w_u(\omega, b) - w_u(0, b) = \frac{-b}{q(\omega)} \begin{vmatrix} p(\omega) - 1 & q(\omega) \\ p'(\omega) & q'(\omega) - 1 \end{vmatrix}.$

Now the determinant in (4) is not zero if the given periodic extremal is
non-degenerate. The usual methods of the calculus of variations serve with the
aid of (4) to furnish a ready proof of the following lemma.*


Lemma 2. If there be given a non-degenerate periodic extremal g on which
the point $u = u_0$ is not conjugate to $u = u_0 + \omega$, then in the (u, v) plane any
point $(u_0, a)$, $a \neq 0$, neighboring $(u_0, 0)$, can be joined to $(u_0 + \omega, a)$ by
an extremal segment g’. If congruent points be regarded as identical, these
extremal segments g’ will make an angle $\alpha$ with themselves at $(u_0, a)$ measured
on the side of g’ towards g, which in magnitude will be either (Case I) always
less than $\pi$, or else (Case II) always greater than $\pi$. If $u_0 = 0$, Cases I or II
will occur according as the sign of

(5) $M = -\frac{1}{q(\omega)} \begin{vmatrix} p(\omega) - 1 & q(\omega) \\ p'(\omega) & q'(\omega) - 1 \end{vmatrix}$

is positive or negative.

The extremal segment g taken from $u = u_0$ to $u = u_0 + \omega$ will be said to be
convex or concave according as $\alpha < \pi$ or $\alpha > \pi$. We shall term M the test quotient.
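
The sign of the test quotient can be followed explicitly in a simple case. The sketch below is an illustration under assumptions not made in the text: the J. D. E. is taken to be $w'' + cw = 0$ with constant c and $R \equiv 1$, $u_0 = 0$, and M is taken to be $-1/q(\omega)$ times the determinant of the matrix (4), §17.

```latex
% Assumptions: model J. D. E.  w'' + c w = 0, R = 1, u_0 = 0, and
%   M = -\frac{1}{q(\omega)}
%       \begin{vmatrix} p(\omega)-1 & q(\omega) \\ p'(\omega) & q'(\omega)-1 \end{vmatrix}.
% Case c < 0, \kappa = \sqrt{-c}:  q(\omega) = \sinh(\kappa\omega)/\kappa,
% determinant = 2 - 2\cosh(\kappa\omega), hence
\[
  M = \frac{2\kappa\,(\cosh\kappa\omega - 1)}{\sinh\kappa\omega}
    = 2\kappa \tanh\frac{\kappa\omega}{2} > 0
  \qquad \text{(convex)} .
\]
% Case c > 0, \theta = \sqrt{c}\,\omega not a multiple of \pi:
% q(\omega) = \sin\theta/\sqrt{c}, determinant = 2 - 2\cos\theta, hence
\[
  M = -\frac{2\sqrt{c}\,(1 - \cos\theta)}{\sin\theta}
    = -2\sqrt{c}\,\tan\frac{\theta}{2} ,
\]
% which is negative (concave) for 0 < \theta < \pi and
% positive (convex) for \pi < \theta < 2\pi.
```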


In case a point $u = u_0$ on a non-degenerate periodic extremal g is conjugate
to its congruent point, we shall say that the segment of g from $u = u_0$ to
$u = u_0 + \omega$ is a conjugate segment. It may happen in the case of a
non-degenerate periodic extremal segment that every point is conjugate to its
congruent point, as examples would show. In any case we see that a
non-degenerate periodic extremal segment from $u = u_0$ to $u = u_0 + \omega$ is either
convex, concave, or conjugate.


* Compare Hadamard, Leçons sur le Calcul des Variations, Paris, 1910, pp. 434-435.

19. Conjugate points on simply-degenerate and doubly-degenerate 
periodic extremals. We prove the following lemma. 


Lemma 1. Let there be given a simply-degenerate, periodic extremal. Then
if w(u) is any one of the 1-parameter family of periodic solutions of the J. D. E.
which is not identically zero, the only points which are conjugate to their congruent
points are points at which w(u) is zero.


Suppose the lemma false. In particular suppose that u=a is a point
conjugate to $u = a + \omega$, while $w(a) \neq 0$.

Let $w_1(u)$ be a solution of the J. D. E. which vanishes at a, but is not
identically zero. Since a is conjugate to $a + \omega$ we have


(1) $w_1(a + \omega) = w_1(a) = 0.$


Abel’s integral gives


(2) $R(u)[w(u)w_1'(u) - w'(u)w_1(u)] = \text{constant}.$


Upon successively substituting u=a and $u = a + \omega$ in this integral we obtain

(3) $w_1'(a + \omega) = w_1'(a).$


Because of (1) and (3) we infer that $w_1(u)$ is periodic. Since $w_1(u)$ and w(u)
are linearly independent, every solution of the J. D. E. is periodic, contrary
to the fact that we are dealing with the simply-degenerate case, and not the
doubly-degenerate case. Thus the lemma is proved.
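
The step from Abel's integral to the equality of the slopes of $w_1$ at a and at $a + \omega$ can be written out as follows. This is a sketch of the omitted computation; it assumes, as is implicit in the use of Abel's integral, that $R(a) \neq 0$.

```latex
% Abel's integral evaluated at u = a and at u = a + \omega:
\[
  R(a)\,\bigl[\,w(a)\,w_1'(a) - w'(a)\,w_1(a)\,\bigr]
  = R(a+\omega)\,\bigl[\,w(a+\omega)\,w_1'(a+\omega) - w'(a+\omega)\,w_1(a+\omega)\,\bigr] .
\]
% Use w_1(a) = w_1(a+\omega) = 0, together with the periodicity
% R(a+\omega) = R(a) and w(a+\omega) = w(a):
\[
  R(a)\,w(a)\,w_1'(a) = R(a)\,w(a)\,w_1'(a+\omega)
  \;\Longrightarrow\; w_1'(a+\omega) = w_1'(a) ,
\]
% since R(a) \neq 0 (assumed) and w(a) \neq 0 (hypothesis).
```

The functions $w_1(u)$ and $w_1(u + \omega)$ then solve the same J. D. E. with the same value and slope at u=a, so they coincide, which is the periodicity asserted.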

Concerning conjugate points on doubly-degenerate periodic extremals, we 
say simply that every point is conjugate to its first congruent point. It would be 
a mistake to believe that this property is characteristic of doubly-degenerate 
periodic extremals, for it may occur, in particular, in the case of a non- 
degenerate periodic extremal, as examples would show. 

20. The rank and form of the matrix of second partial derivatives of J. 
We prove the following 


THEOREM 4. Corresponding to the given periodic extremal g of §15, the
symmetric matrix a of elements


$a_{ij} = J^0_{v_i v_j}$


is of rank n, n-1, or n-2, according as g is non-degenerate, simply-degenerate,
or doubly-degenerate. If g is non-degenerate a is always in normal form.*


* Bôcher, loc. cit., p. 59.




The case where g is non-degenerate. To prove the theorem in this case we
turn again to the (u, w) plane of the J. D. E. of §18, and in that plane consider
the points


(1) $(u_0, c_0), (u_1, c_1), \ldots, (u_n, c_n),$


of which $(u_0, c_0)$ is supposed congruent to $(u_n, c_n)$. We join the successive
points of (1) by curve segments representing solutions of the J. D. E., and
denote the resulting curve by $\lambda$. The curve $\lambda$ will represent a periodic
solution of the J. D. E., if and only if the slope of $\lambda$ at $(u_0, c_0)$ equals its
slope at $(u_n, c_n)$, and $\lambda$ has no corners at the remaining points of (1). These
conditions will be fulfilled if equations (5) of §5 are satisfied for
$i = 1, 2, \ldots, n$, and for $c_0$ and $c_{n+1}$ respectively replaced by $c_n$ and $c_1$.
Let equations (5) of §5, so altered and understood, be denoted by (5a). The
J. D. E. will have a periodic solution, not identically zero, if and only if
equations (5a) are satisfied by a set of constants $(c_1, \ldots, c_n)$ not all zero.
Such a periodic solution is possible if, and only if, the matrix of the
coefficients of $(c_1, \ldots, c_n)$ in (5a) is of rank less than n. With the aid of
the results of §16 this matrix is seen to be identical with the matrix a of the
present theorem, provided the factor $R(u_i)$ be removed, for each i, from the
ith row of a. Thus the J. D. E. has no periodic solution except $w \equiv 0$, if and
only if the rank of a is n.

That the matrix a is always arranged in normal form when g is a non-
degenerate periodic extremal, can now be proved after the manner of proof
that the matrix a of §5 is in normal form.

The case where g is simply-degenerate. According to the proof already given
the rank of a here is less than n. It remains to prove that the rank of a is n-1.

Of the points $(u_1, 0), \ldots, (u_n, 0)$ at least one is not conjugate to its
congruent point, for otherwise it would follow from the lemma of §19 that
the set of points $(u_1, 0), \ldots, (u_n, 0)$ are mutually conjugate, contrary to the
original choice of $(u_1, u_2, \ldots, u_n)$. We can then suppose, without loss of
generality, that $u = u_n$ has been so chosen as not to be conjugate to
$u = u_n - \omega = u_0$.

Now the minor $A_{n-1}$, obtained from a by striking out the last row and
column, is one identical in form with the minor $A_{n-1}$ considered in §5.
According to Corollary 1, §5, $A_{n-1}$ will not be zero, since $(u_0, 0)$ is not
conjugate to $(u_n, 0)$. Thus in the simply-degenerate case the rank of a is n-1.

The case where g is doubly-degenerate. As in the preceding case the rank
of a is less than n. Further, every point on v=0 is here conjugate to its
congruent point. In particular the point $(u_0, 0)$ is conjugate to its congruent
point $(u_n, 0)$. It follows from Corollary 1, §5, that $A_{n-1} = 0$. But if we
understand that $u_i + \omega = u_{i+n}$ for each integer i, then by advancing the
subscripts we can bring it to pass that $(u_i, 0)$ is denoted by $(u_n, 0)$. In the
matrix a the principal minor obtained by deleting the ith row and column becomes
the principal minor $A_{n-1}$ when $(u_i, 0)$ becomes $(u_n, 0)$. Thus all the principal
(n-1)-rowed minors of a are zero. But according to Corollary 1, §5, $A_{n-2}$
is not zero, since $(u_0, 0)$ is conjugate to $(u_n, 0)$ and cannot at the same time
be conjugate to $(u_{n-1}, 0)$. Thus a is in this case of rank n-2, and the proof
of the theorem is complete.

21. The type number of a non-degenerate periodic extremal. We prove 
the following 


THEOREM 5. If the periodic extremal g of §15 is non-degenerate, the type
number k of the corresponding critical point of the function $J(v_1, \ldots, v_n)$ of
§15 will be independent of the choice of n among admissible integers n, and of the
points $(u_1, \ldots, u_n)$ on g among admissible points $(u_1, \ldots, u_n)$, and may be
determined as follows. Setting $u_n - \omega = u_0$, let m be the number of conjugate
points to $u = u_0$ preceding $u = u_n$. If $u = u_0$ is conjugate to $u = u_n$, k = m+1.
If $u = u_0$ is not conjugate to $u = u_n$, then k = m, or m+1, according as the segment
of g from $u = u_0$ to $u = u_n$ is convex or concave (§18).

The number k will be called the type number of g.


As previously we concern ourselves here with the symmetric matrix a
of which the elements are


$a_{ij} = J^0_{v_i v_j}.$


According to Theorem 4, §20, the matrix a in the case of a non-degenerate
periodic extremal, is of rank n, and arranged in normal form. The type
number k desired is then simply the number of changes in sign of the
principal minors $A_0, A_1, \ldots, A_n$ (cf. §5, §6). To proceed further we agree
again to set $u_{i+n} = u_i + \omega$ and $v_{i+n} = v_i$ for all integers i.

Now the matrix whose elements are those in $A_{n-1}$ would be identical
with the matrix a of §5 and §6, if in §5 and §6 we were dealing with the
segment of g for which u lies between $u = u_0$ and $u = u_n$, and if we had chosen
n-1 intermediate points $(u_1, \ldots, u_{n-1})$ instead of n. To proceed further
we distinguish between two cases.


Case I. The point $u = u_0$ is not conjugate to $u = u_n = u_0 + \omega$. In this case
it follows from Theorem 2, §6, that the segment of g for which u lies between
$u = u_0$ and $u = u_n$ is of type m, and that $A_{n-1} \neq 0$. Thus there are m changes
of sign in the sequence $A_0, A_1, \ldots, A_{n-1}$. Hence in Case I the number k of
the present theorem is m or m+1 according to whether or not $A_{n-1}$ and $A_n$
have the same sign.

Now consider the n equations (5) of §5, altered as follows. We here
replace $c_0$ by $c_n$, $c_{n+1}$ by $c_1$, and the zero of the right hand member of the
nth equation by


(1) $A_{n-1}/A_n,$

while finally we here multiply the left hand member of the ith one of these
equations ($i = 1, 2, \ldots, n$) by $R(u_i)$. We denote the resulting set of
equations by (5b). Equations (5b) can be solved for $(c_1, \ldots, c_n)$, giving in
particular for $c_n$ a positive value


$c_n = \frac{A_{n-1}^2}{A_n^2}.$


The interpretation of this solution is that there is a solution of the J. D. E.
which in the (u, w) plane passes through the points


$(u_0, c_n), (u_1, c_1), (u_2, c_2), \ldots, (u_n, c_n),$


whose slope at the point $(u_n, c_n)$ minus its slope at $(u_0, c_n)$ equals the fraction
(1) divided by $R(u_n)$.

This difference of slopes, and hence the sign of (1), is readily seen to be
positive or negative according as the segment of g taken from $u = u_0$ to
$u = u_n$ is convex or concave in the sense of §18. Thus in Case I, k equals
m or m+1 according as the segment of g from $u = u_0$ to $u = u_n$ is convex or
concave.

Case II. The point $u = u_0$ is conjugate to $u = u_n = u_0 + \omega$. In this case
$A_{n-1} = 0$, according to Corollary 1, §5. Since $u = u_0$ is conjugate to $u = u_n$,
it cannot be conjugate to $u = u_{n-1}$. If then Theorem 2, §6, be applied to the
segment of g from $u = u_0$ to $u = u_{n-1}$, we find that that segment is of type m,
since there are m points conjugate to $u_0$ preceding $u_{n-1}$. Hence there are m
changes of sign in $A_0, A_1, \ldots, A_{n-2}$. But according to the theory of
regularly arranged quadratic forms, when $A_{n-1} = 0$, $A_{n-2}$ and $A_n$ have opposite
signs. Thus there are m+1 changes of sign in $A_0, A_1, \ldots, A_n$. Hence
k = m+1 in this case.


We can now prove that the type number k is independent of the choice of n,
and $(U) = (u_1, \ldots, u_n)$, among admissible integers n and points (U).


The formulas of §16 and §4 show that the partial derivatives

$J^0_{v_i v_j}$

are continuous functions of the position of admissible points (U). Now any
admissible point (U) can be varied continuously through admissible points
(U) into any other admissible point (U) for which n is the same. During such
a variation the terms of the sequence


(2) $A_0, A_1, \ldots, A_n$


will vary continuously, and $A_n$ will never be zero. We can now prove that
there can be no variation in the total count of changes of sign in (2). This
will certainly be true if no member of the sequence (2) is ever zero. If
$A_i$ becomes zero (where i cannot be zero or n) then $A_{i-1}$ and $A_{i+1}$ will have
opposite signs at that stage of the variation.* Thus there can be no variation
in the total count of changes of sign in (2), and the statement in italics is
proved, except for possible changes of n.

That permissible changes in n do not alter the number k follows from the
fact that, if we hold $u_n$ fast and introduce or remove any set of points $u_i$
so as to have an admissible set $u_i$ left, the number k, as determined under
Cases I and II, will be the same before and after the change. Thus the
theorem is completely proved.
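
To see the rule of the theorem at work, one may tabulate k in a case where everything is explicit. The table below is an illustration only, not taken from the text: it assumes the model J. D. E. $w'' + cw = 0$ with constant c and $R \equiv 1$, writes $\theta = \sqrt{c}\,\omega$, and takes the test quotient to be $-1/q(\omega)$ times the determinant of the matrix (4), §17, which for this model equals $-2\sqrt{c}\tan(\theta/2)$.

```latex
% Assumption: model J. D. E.  w'' + c w = 0, R = 1, \theta = \sqrt{c}\,\omega.
% For c > 0 the points conjugate to u_0 are u_0 + j\pi/\sqrt{c}, j = 1, 2, \ldots,
% so m is the number of integers j >= 1 with j\pi < \theta,
% and M = -2\sqrt{c}\,\tan(\theta/2).
\[
  \begin{array}{lccl}
    \theta & m & \text{segment} & k \\ \hline
    0 < \theta < \pi     & 0 & \text{concave } (M < 0) & m + 1 = 1 \\
    \theta = \pi         & 0 & \text{conjugate}        & m + 1 = 1 \\
    \pi < \theta < 2\pi  & 1 & \text{convex } (M > 0)  & m = 1 \\
    2\pi < \theta < 3\pi & 2 & \text{concave } (M < 0) & m + 1 = 3
  \end{array}
\]
% \theta = 2\pi is excluded: there the extremal is doubly-degenerate.
% For c < 0 there are no conjugate points, m = 0, and M > 0 (convex), so k = 0.
```

In this model k stays fixed as $\theta$ crosses $\pi$, where only the roles of m and of the convexity test are exchanged, and it changes only when $\theta$ crosses $2\pi$, that is, only through a degenerate extremal.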


Part IV. THE TYPE-NUMBER OF A DEGENERATE PERIODIC EXTREMAL 


22. Simply-degenerate, isolated, analytic, periodic, extremals. By an
isolated periodic extremal is understood a periodic extremal g, in the
neighborhood of which there is no other periodic extremal with a length
neighboring that of g. In this section and the two following we assume that we
are dealing with a periodic extremal g given in the region S of §1. Concerning
$F(x, y, \dot{x}, \dot{y})$ we make the same assumptions as in §1 together with the
assumption that $F(x, y, \dot{x}, \dot{y})$ be analytic for the values of
$(x, y, \dot{x}, \dot{y})$ admitted. The extremal g is to be without multiple points,
isolated, simply-degenerate, and analytic. Along g, $F(x, y, \dot{x}, \dot{y})$ is to
be positive. We denote g’s length by $\omega$.

The region neighboring g can be mapped conformally on the region
neighboring the u axis in the (u, v) plane in such a fashion that g corresponds
to any segment of the u axis of length $\omega$, and so that congruent points (u, v)
and ($u + \omega$, v) correspond to the same point (x, y) in S, but that otherwise
the transformation is one-to-one. As in §3 we can derive from
$F(x, y, \dot{x}, \dot{y})$ an integrand f(u, v, v’) corresponding to which v=0 is an
extremal. According to the lemma of §19 only a finite number of points of g are
conjugate to their congruent points in the simply-degenerate case. Let
(u, v) = (0, 0) be one of the points of g not conjugate to its congruent point.
As is well known, for a sufficiently small constant $a \neq 0$, any point (0, a) can
be joined to its congruent point ($\omega$, a) by an extremal segment g’ neighboring
g. Further, if congruent points be considered as identical these extremal
segments g’ make an angle with themselves at ($\omega$, a) measured on the side of g’
toward g which in magnitude will now be shown to be either


* Bôcher, loc. cit., p. 147.




(I) Less than $\pi$ for all vertices ($\omega$, a) on one side of g, and greater than $\pi$
for ($\omega$, a) on the other side of g;

(II) Less than $\pi$ for all vertices ($\omega$, a) on either side of g; or

(III) Greater than $\pi$ for all vertices ($\omega$, a) on either side of g.

Further, whether I, II, or III occurs depends only upon g and the choice
of the point (u, v) = (0, 0).


The family of extremals passing through the point (u, v) = (0, a) with a
slope b can be represented in the form

(1) $v = A(u, a, b), \quad a^2 + b^2 \leq e,$

where A(u, a, b) is a real analytic function of (u, a, b) for u on any finite
segment of the u axis and for a corresponding positive constant e sufficiently
small, where further


(2) $a = A(0, a, b),$
(3) $b = A_u(0, a, b).$


The condition that an extremal (a, b) pass from a point (0, a) to ($\omega$, a) is
that


(4) $A(\omega, a, b) - a = 0.$

Now we have

(5) $A_b(\omega, 0, 0) \neq 0,$


since (0, 0) has been chosen so as not to be conjugate to ($\omega$, 0). Hence (4)
possesses a solution


(6) $b = \beta(a), \quad \beta(0) = 0,$


where $\beta(a)$ is analytic in a at a=0. The slope of an extremal (a, b) given by
(6) at ($\omega$, a), minus its slope at (0, a), is


(7) $A_u(\omega, a, \beta(a)) - \beta(a).$


This function is not identically zero in a. For otherwise extremals (a, b)
given by (6) would make up a continuous family of periodic extremals,
a case which has been barred. Hence we can write


(8) $A_u(\omega, a, \beta(a)) - \beta(a) = A(a) = a^r k(a), \quad k(0) \neq 0, \quad r > 1,$
where k(a) is analytic in a at a=0. That r is an integer exceeding one is seen
from the fact that


(9) $A'(0) = -\frac{1}{A_b(\omega, 0, 0)} \begin{vmatrix} A_a(\omega, 0, 0) - 1 & A_b(\omega, 0, 0) \\ A_{ua}(\omega, 0, 0) & A_{ub}(\omega, 0, 0) - 1 \end{vmatrix}.$


In terms of p(u) and q(u) of §17 this gives


(10) $A'(0) = -\frac{1}{q(\omega)} \begin{vmatrix} p(\omega) - 1 & q(\omega) \\ p'(\omega) & q'(\omega) - 1 \end{vmatrix}.$


Now the determinant in (10) is always zero if the periodic extremal is
degenerate (§17). Thus here A’(0)=0 and r>1. The statements in italics
follow at once from (8).
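
The expression for A’(0) as a quotient of a determinant by $A_b$ can be obtained by two differentiations. The sketch below spells out this omitted step; all partial derivatives of A(u, a, b) are understood to be evaluated at $(\omega, 0, 0)$, and the only ingredients are (4), (6), and the definition of A(a) in (8).

```latex
% From (4) with b = \beta(a):  A(\omega, a, \beta(a)) - a = 0.
% Differentiating with respect to a at a = 0:
\[
  A_a - 1 + A_b\,\beta'(0) = 0
  \;\Longrightarrow\;
  \beta'(0) = -\frac{A_a - 1}{A_b} .
\]
% Differentiating A(a) = A_u(\omega, a, \beta(a)) - \beta(a) at a = 0:
\[
  A'(0) = A_{ua} + (A_{ub} - 1)\,\beta'(0)
        = -\frac{1}{A_b}\,\bigl[\,(A_a - 1)(A_{ub} - 1) - A_b\,A_{ua}\,\bigr] .
\]
% The bracket is the determinant with rows (A_a - 1, A_b) and (A_{ua}, A_{ub} - 1).
% Along v = 0 one has A_a(u,0,0) = p(u) and A_b(u,0,0) = q(u), which gives
% the same expression in terms of p and q.
```

Since A(0) = 0 and A’(0) = 0, the expansion of A(a) begins with a term of degree at least two, which is the assertion r>1.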

23. A simply-degenerate periodic extremal. Cases: concave-convex,
concave, convex. A simply-degenerate, isolated, analytic, periodic, extremal
g, taken from u=0 to $u = \omega$ will be said to be concave-convex, convex, or
concave, according as I, II, or III of the preceding section holds. We have
previously given similar definitions for a non-degenerate periodic extremal
(§18) involving the terms convex and concave. The concave-convex case did not
present itself in the non-degenerate case. In the present case we state in terms
of r and k(0) in (8) of the preceding section that the segment of g from u=0
to $u = \omega$ is


(I) Concave-convex when r is even,
(II) Convex when r is odd and k(0) > 0,
(III) Concave when r is odd and k(0) < 0.
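
The three cases can be read off from the sign of the slope difference $a^r k(a)$ of (8), §22, on the two sides of g. The sketch below rests on an assumption suggested by Lemma 2, §18, but not stated in this form in the text: the angle measured toward g is less than $\pi$ exactly when a times the slope difference is positive.

```latex
% Assumed criterion (see the lead-in): the angle toward g at (\omega, a)
% is less than \pi exactly when  a \cdot A(a) > 0,  where A(a) = a^r k(a).
\[
  a \cdot A(a) = a^{\,r+1}\,k(a), \qquad k(a) \to k(0) \neq 0 \text{ as } a \to 0 .
\]
% r odd:  r + 1 is even, so a^{r+1} > 0 on both sides of g and
\[
  \operatorname{sign}\bigl(a\,A(a)\bigr) = \operatorname{sign} k(0)
  \quad \text{for } a > 0 \text{ and for } a < 0 ,
\]
% giving convex for k(0) > 0 and concave for k(0) < 0.
% r even:  r + 1 is odd, so a^{r+1} changes sign with a and
\[
  \operatorname{sign}\bigl(a\,A(a)\bigr) = \operatorname{sign}(a)\,\operatorname{sign} k(0) ,
\]
% so the angle is less than \pi on one side of g and greater on the other.
```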


24. The type number of a simply-degenerate periodic extremal. To define
the type number of g we are going to modify the integrand f(u, v, v’) in such
a fashion that we shall no longer have to do with a degenerate periodic
extremal. Let h(u) be any real analytic function of u with a period $\omega$, and
$\mu$ be a parameter. An integrand of the form


(1) $f(u, v, v') + \mu v h(u)$


reduces for $\mu = 0$ to the original integrand. We will now prove the following
theorem.


THEOREM 6. Concerning the simply-degenerate periodic extremal g of §22,
on which (0, 0) can and will be chosen so as not to be conjugate to ($\omega$, 0), we can
say the following. It is possible to choose a function h(u) real and analytic in u
with a period $\omega$, and values of $\mu$ arbitrarily near $\mu = 0$ in such a fashion, that if
f(u, v, v’) be replaced by the integrand (1) then we have the following:

(A) If g, taken from u=0 to $u = \omega$, is convex-concave, the modified problem
will possess no extremals neighboring g of period $\omega$ in u.

($B_1$) If g, taken from u=0 to $u = \omega$, is convex or concave, the modified
problem will possess just one extremal E neighboring g of period $\omega$ in u.




($B_2$) The extremal E of ($B_1$) will be non-degenerate, and if there are on g
just m points conjugate to u=0 and preceding $u = \omega$, the type number of E
as determined in Theorem 5, §21, will be m or m+1 according as g, taken from
u=0 to $u = \omega$, is convex or concave.


The choice of h(u). The extremals which correspond to the integrand (1)
for values of $\mu$ neighboring $\mu = 0$, and which lie in the neighborhood of the
extremal v=0, can be represented in the form

(2) $v = B(u, a, b, \mu), \quad a^2 + b^2 + \mu^2 \leq e,$

with

(3) $a = B(0, a, b, \mu),$

(4) $b = B_u(0, a, b, \mu),$

where $B(u, a, b, \mu)$ is analytic in its arguments, for u on any closed interval,
and e a correspondingly sufficiently small positive constant.
We will show that we can choose h(u) so that


(5) $B_\mu(\omega, 0, 0, 0) \neq 0,$

(6) $B_{u\mu}(\omega, 0, 0, 0) \neq 0.$

Differentiation of the Euler equation with respect to $\mu$ will show that the
function $B_\mu(u, 0, 0, 0)$, set equal to w(u), satisfies

(7) $R w'' + R' w' + (Q' - P) w = h(u)$

where R(u), P(u), and Q(u) are the functions already used in the J. D. E.
set up for the extremal v=0, when $\mu = 0$.


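It follows from (3) and (4) that we have initially

(8) $w(0) = B_\mu(0, 0, 0, 0) = 0, \quad w'(0) = B_{u\mu}(0, 0, 0, 0) = 0.$


If we make use of the functions p(u) and q(u) of §17, it is readily seen that
this solution w(u) of (7) is given by

$w(u) = B_\mu(u, 0, 0, 0) = \int_0^u \frac{h(t)}{R(0)} \begin{vmatrix} p(t) & q(t) \\ p(u) & q(u) \end{vmatrix} dt.$


In particular


(9) $B_\mu(\omega, 0, 0, 0) = \int_0^\omega \frac{h(t)}{R(0)} \begin{vmatrix} p(t) & q(t) \\ p(\omega) & q(\omega) \end{vmatrix} dt,$

(10) $B_{u\mu}(\omega, 0, 0, 0) = \int_0^\omega \frac{h(t)}{R(0)} \begin{vmatrix} p(t) & q(t) \\ p'(\omega) & q'(\omega) \end{vmatrix} dt.$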


Now the coefficients of h(t) in the integrands of (9) and (10) are
respectively 0 and $1/R(0)$ for $t = \omega$. They are both positive for t in an interval


(11) $\omega - e < t < \omega,$


if e be a sufficiently small positive constant. Let $h_1(t)$ be a function which
is identically zero as t ranges from t=0 to $t = \omega$, except that in the interval
(11), $v = h_1(t)$ shall be equal to the v ordinate on a semicircle with end points
at $(\omega - e, 0)$ and $(\omega, 0)$ and on which v>0. If in (9) and (10), h(t) be replaced
by $h_1(t)$, the resulting integrals would both be positive. Another fact of
importance is that if e in (11) be a sufficiently small positive constant the
corresponding function $h_1(t)$ will not only make the integrals (9) and (10)
positive but will cause the ratio of the integral (9) to the integral (10) to be
arbitrarily small. Now $h_1(t)$ can be approximated by a number of terms of
a Fourier series with period $\omega$. The resulting function, which will serve as
our definition of h(t), can be taken as a function approximating $h_1(t)$ so
closely that for this choice of h(t) in (9) and (10) the ratio of
$B_\mu(\omega, 0, 0, 0)$ to $B_{u\mu}(\omega, 0, 0, 0)$ will be arbitrarily small, and both
$B_\mu(\omega, 0, 0, 0)$ and $B_{u\mu}(\omega, 0, 0, 0)$ will be positive.
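
The integral representation of $w(u) = B_\mu(u, 0, 0, 0)$ on which this choice of h(t) rests can be checked directly. The sketch below writes the J. D. E. as $(Rw')' + (Q' - P)w = 0$, which is (7) with h = 0, and introduces the abbreviation W(t) for the combination appearing in Abel's integral; both are conveniences of this illustration.

```latex
% Notation for this check: W(t) = p(t) q'(t) - p'(t) q(t), so that by
% Abel's integral and the initial conditions of p, q:  R(t) W(t) = R(0).
\[
  w(u) = \int_0^u \frac{h(t)}{R(0)}\,\bigl[\,p(t)\,q(u) - q(t)\,p(u)\,\bigr]\,dt ,
  \qquad
  w'(u) = \int_0^u \frac{h(t)}{R(0)}\,\bigl[\,p(t)\,q'(u) - q(t)\,p'(u)\,\bigr]\,dt ,
\]
% the boundary term in w' vanishing because the bracket is zero at t = u.
% Differentiating R w' once more:
\[
  (R\,w')'(u) = \frac{h(u)\,R(u)\,W(u)}{R(0)}
  + \int_0^u \frac{h(t)}{R(0)}\,\bigl[\,p(t)\,(R\,q')'(u) - q(t)\,(R\,p')'(u)\,\bigr]\,dt .
\]
% Since (R p')' = -(Q' - P) p and (R q')' = -(Q' - P) q, and R W = R(0):
\[
  (R\,w')' + (Q' - P)\,w = h(u), \qquad w(0) = 0, \quad w'(0) = 0 .
\]
```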

We can now settle the question of the existence of periodic extremals near
the extremal v=0 corresponding to the integrand (1) when $\mu$ is near $\mu = 0$.
The conditions for a periodic extremal neighboring v=0 for $\mu$ near $\mu = 0$ are


(12) $B(\omega, a, b, \mu) - a = 0,$
(13) $B_u(\omega, a, b, \mu) - b = 0.$


Now equation (12) reduces for $\mu = 0$ to equation (4) of §22. But (4) of §22
is satisfied by the function $b = \beta(a)$ of (6), §22. Hence (12) is satisfied by
$b = \beta(a)$ and $\mu = 0$. Further


(14) $B_b(\omega, 0, 0, 0) \neq 0,$


since on the extremal v=0, the point (0, 0) was chosen not conjugate to
($\omega$, 0). Hence (12) admits a solution of the form


(15) $b = b(a, \mu) = \beta(a) + \mu R(a, \mu),$

where $R(a, \mu)$ is analytic in a and $\mu$, at a=0 and $\mu = 0$.
We are next concerned with solving (13) subject to (12), that is with
solving

(16) $B_u[\omega, a, b(a, \mu), \mu] - b(a, \mu) = 0.$
Using the fact that $b(a, 0) = \beta(a)$ we obtain the identity

$B_u[\omega, a, b(a, 0), 0] - b(a, 0) = A_u[\omega, a, \beta(a)] - \beta(a),$

which becomes with the aid of (8), §22,

$B_u[\omega, a, b(a, 0), 0] - b(a, 0) = a^r k(a), \quad k(0) \neq 0.$

Hence (16) becomes

(17) $B_u[\omega, a, b(a, \mu), \mu] - b(a, \mu) = a^r k(a) + \mu S(a, \mu) = 0,$


where $S(a, \mu)$ is analytic at a=0 and $\mu = 0$. We can solve (17) for $\mu$ if
$S(0, 0) \neq 0$. But S(0, 0) is the partial derivative with respect to $\mu$, at
$(a, \mu) = (0, 0)$, of the left hand member of (17). Thus


(18) $S(0, 0) = [1 - B_{ub}(\omega, 0, 0, 0)] \frac{B_\mu(\omega, 0, 0, 0)}{B_b(\omega, 0, 0, 0)} + B_{u\mu}(\omega, 0, 0, 0).$


Now as we have seen we can choose h(t) so that $B_{u\mu}(\omega, 0, 0, 0)$ will be positive
at the same time that the ratio of $B_\mu(\omega, 0, 0, 0)$ to $B_{u\mu}(\omega, 0, 0, 0)$ is
arbitrarily small, so that the term $B_{u\mu}(\omega, 0, 0, 0)$ will dominate the sign in
(18). Thus $S(0, 0) \neq 0$ for a proper choice of h(t). Hence (17) admits a
solution of the form


(19) $\mu = a^r d(a), \quad d(0) \neq 0,$


where d(a) is analytic at a=0. The solution of (12) and (13) is now given by
(19), taken with (15).
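
The value of S(0, 0) and the leading coefficient d(0) follow from two implicit differentiations. The sketch below records them; every partial derivative of B is understood to be taken at $(\omega, 0, 0, 0)$, and the formula for d(0) is a consequence drawn here rather than one displayed in the text.

```latex
% Differentiate (12),  B(\omega, a, b(a,\mu), \mu) - a = 0,
% with respect to \mu at (a, \mu) = (0, 0):
\[
  B_b\, b_\mu + B_\mu = 0
  \;\Longrightarrow\;
  b_\mu(0, 0) = -\frac{B_\mu}{B_b} .
\]
% Differentiate the left member of (17) with respect to \mu at (0, 0):
\[
  S(0, 0) = (B_{ub} - 1)\, b_\mu + B_{u\mu}
          = (1 - B_{ub})\,\frac{B_\mu}{B_b} + B_{u\mu} .
\]
% Solving  a^r k(a) + \mu S(a, \mu) = 0  for \mu near (0, 0):
\[
  \mu = a^r d(a), \qquad d(0) = -\frac{k(0)}{S(0, 0)} \neq 0 .
\]
```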

Final proof of (A). Now the integer r in (19) is the integer r in (8), §22.
If the given periodic extremal is convex-concave, r is even. In this case let
$\mu$ be chosen arbitrarily small, different from zero, and opposite in sign to
d(0) in (19). For this $\mu$, (19) will admit no real solution a neighboring a=0,
and corresponding to this $\mu$ the problem will possess no periodic extremal
neighboring the extremal v=0.

Proof of ($B_1$). If the given periodic extremal is convex or concave, r is
odd, and corresponding to any value of $\mu$, say $\mu_1 \neq 0$, neighboring $\mu = 0$,
(19) gives a value of a, say $a_1$, and (15) then a value of b, say $b_1$,
corresponding to which the modified problem for which $\mu = \mu_1$ will possess a
single periodic extremal E, with initial point $(0, a_1)$ and initial slope $b_1$.
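
The role of the parity of r in these two proofs can be made explicit. The following is a sketch for small a and $\mu$, starting from $\mu = a^r d(a)$ with $d(0) \neq 0$; the approximate root formula is an illustration, not a statement from the text.

```latex
% Starting point:  \mu = a^r d(a),  d(0) \neq 0,  d analytic at a = 0.
% r even:  a^r > 0 for a \neq 0, and d(a) has the sign of d(0) near a = 0, so
\[
  \operatorname{sign}\bigl(a^r d(a)\bigr) = \operatorname{sign} d(0)
  \quad (a \neq 0 \text{ small}) ,
\]
% and no real a near 0 satisfies the equation when \mu is opposite in sign to d(0).
% r odd:  the derivative of the right member is
\[
  \frac{d}{da}\bigl[a^r d(a)\bigr] = a^{\,r-1}\,\bigl[\,r\,d(a) + a\,d'(a)\,\bigr] ,
\]
% which has the sign of d(0) for small a \neq 0 because r - 1 is even; the right
% member is therefore strictly monotone near a = 0, and each small \mu_1 \neq 0
% gives exactly one real root, to leading order
\[
  a_1 \approx \Bigl(\frac{\mu_1}{d(0)}\Bigr)^{1/r} \quad \text{(real odd root)} .
\]
```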

Proof of ($B_2$). The sign of the test quotient M of (5), §18, determined for E.
To show that E is non-degenerate it will be sufficient to prove that the test
quotient M of Lemma 2, §18, evaluated for E, is not zero. According to
the results of §18, E, taken from u=0 to $u = \omega$, will be convex or concave
according as M is positive or negative. By thus determining the sign of
M we intend to prove that the non-degenerate extremal E is convex or
concave according as the given simply-degenerate extremal v=0 is convex or
concave, taking both extremals from u=0 to $u = \omega$.




More generally the quotient M of §18 set up for any extremal with
parameters $(a, b, \mu)$ in (2), will be denoted by $M(a, b, \mu)$. We have


(20) $M(a, b, \mu) = -\frac{1}{B_b} \begin{vmatrix} B_a - 1 & B_b \\ B_{ua} & B_{ub} - 1 \end{vmatrix},$

where in each of the partial derivatives we set $u = \omega$, but where $(a, b, \mu)$ may
take on any values neighboring (0, 0, 0). The form (20) shows that in terms
of the function $b(a, \mu)$ of (15)


$M[a, b(a, \mu), \mu] = \frac{\partial}{\partial a} \{ B_u[\omega, a, b(a, \mu), \mu] - b(a, \mu) \}.$


With the aid of (17) we therefore have

(21) $M[a, b(a, \mu), \mu] = r a^{r-1} k(a) + a^r k'(a) + \mu S_a(a, \mu).$


We shall evaluate (21) for values of $\mu$ given by (19), thereby evaluating
$M(a, b, \mu)$ for values of $(a, b, \mu)$ which correspond to periodic extremals. Thus


(22) $M[a, b\{a, \mu(a)\}, \mu(a)] = r a^{r-1} k(0) + \cdots, \quad k(0) \neq 0,$


where the terms omitted are of higher order than r-1 in a.

Now according to the results of §23 the given simply-degenerate extremal
v=0, taken from u=0 to $u = \omega$, is convex or concave when r is odd, and more
particularly is convex or concave according as k(0) is positive or negative.
It follows from (22) and (19) that if E corresponds to a parameter $\mu \neq 0$,
sufficiently small in absolute value, then E will be convex or concave
according as the extremal v=0 is convex or concave (§18).

Finally the type of the non-degenerate extremal E, determined according
to Theorem 5, §21, equals the number m of points on E conjugate to u=0
and preceding $u = \omega$, plus one when E is concave, and exactly m when E
is convex. But this number m will equal the corresponding number
determined on the extremal v=0, provided only that $\mu$ be sufficiently small in
absolute value. Thus part ($B_2$) of the theorem is proved.


In accordance with the results of the preceding theorem, the given simply- 
degenerate periodic extremal g, in case it is convex or concave, will be said to be 
equivalent in type to the non-degenerate extremal E whose existence is affirmed 
by the preceding theorem, and in case it is convex-concave will be said to be 
equivalent to a null set of extremals and to be neutral in type. 


25. A doubly-degenerate, isolated, analytic, periodic, extremal. We 
make the same assumptions here as we made in the case of a simply- 
degenerate periodic extremal (§22), except that here the given extremal g 




is to be doubly-degenerate (§17). We will reduce the determination of the 
type of g to the cases already considered, that is, the non-degenerate and 
simply-degenerate cases. 


THEOREM 7. Concerning the preceding doubly-degenerate, isolated, periodic
extremal g, and the corresponding simplified integrand f(u, v, v’) for which
g becomes the extremal v=0, we can say the following. It is possible to choose a
function h(u) analytic and periodic in u with the period $\omega$, such that the
extremals corresponding to the modified integrand


$f(u, v, v') + \mu v h(u),$


for properly chosen values of the parameter $\mu$, neighboring $\mu = 0$, will include
at most a finite set $\sigma$ of extremals with the period $\omega$ in u that lie in the
neighborhood of v=0, and such that none of these extremals will be
doubly-degenerate.


To prove Theorem 7 we proceed as in the proof of Theorem 6, §24. We
represent the extremals neighboring v=0 as in §24 and choose h(u) in the
same way, so that we may regard equations (1) to (13) of §24 as holding here.
At this point the two proofs diverge, since (14) of §24 does not hold here.
In terms of the function A(ω, a, b) of §22 of the unmodified problem, equations
(12) and (13) of §24 become

(1) B(ω, a, b, μ) − a = A(ω, a, b) − a + μD(a, b, μ) = 0,

(2) B_u(ω, a, b, μ) − b = A_u(ω, a, b) − b + μE(a, b, μ) = 0,

where D(a, b, μ) and E(a, b, μ) are analytic in (a, b, μ) at (a, b, μ) = (0, 0, 0).
With the aid of (5) and (6), §24, we see that 

(3) D(0, 0, 0) = B_μ(ω, 0, 0, 0) ≠ 0,

(4) E(0, 0, 0) = B_{uμ}(ω, 0, 0, 0) ≠ 0.

Because of (3) and (4), (1) and (2) can be solved for μ. These solutions take
the forms, respectively,

(5) μ = [A(ω, a, b) − a]G(a, b),

(6) μ = [A_u(ω, a, b) − b]H(a, b),


where G(a, b) and H(a, b) are analytic at (a, b) = (0, 0) and do not vanish
there. To solve (5) and (6) simultaneously we are led to the equation


(7) [A(ω, a, b) − a]G(a, b) = [A_u(ω, a, b) − b]H(a, b).


We distinguish between two cases: 
Case I: Equation (7) holds identically. 
Case II: Equation (7) does not hold identically. 




The proof in Case I. In this case neither of the differences

A(ω, a, b) − a, A_u(ω, a, b) − b

can vanish along any real analytic arcs in the (a, b) plane neighboring (a, b)
= (0, 0). For if either of these two differences vanished along such real arcs,
according to (7) the other difference would also so vanish, and the extremals
for the unmodified problem μ=0 would include a family of periodic
extremals contrary to the assumption that g is isolated. Except at (0, 0),
the two members of (7) must then be of one sign, say positive, throughout
the neighborhood of (a, b) = (0, 0). If then μ be chosen negative and
sufficiently near μ=0, there will be no solution of (5) and (6) and hence no
solution of (1) and (2). For each such choice of μ the modified problem
will possess no periodic extremals neighboring g.

Proof in Case II. It may happen in this case that (7) possesses no real
solutions neighboring (a, b) = (0, 0) other than (0, 0). If this occurs, then
for any choice of μ≠0, sufficiently near μ=0, (1) and (2) admit no real
solutions (a, b, μ) and the modified problem possesses no periodic extremals
neighboring g.

If, on the other hand, (7) possesses real solutions (a, b) ≠ (0, 0), arbitrarily
near (a, b) = (0, 0), these real solutions will make up a finite number of real
analytic arcs representable in a one-to-one manner by function pairs of the
form

(8) a = a(t), a(0) = 0,
    b = b(t), b(0) = 0,

where a(t) and b(t) are real analytic functions of t at t=0, and are not both
identically zero. The differences

(9) A[ω, a(t), b(t)] − a(t),
    A_u[ω, a(t), b(t)] − b(t)

cannot both be identically zero, for otherwise a(t), b(t) would correspond
in the unmodified problem to a continuous family of periodic extremals.
On the other hand neither of the functions in (9) can be identically zero
in t without the other being identically zero in t, as follows from (7). We
can then set

(10) A[ω, a(t), b(t)] − a(t) = t^s F(t), F(0) ≠ 0, s > 0,

where F(t) is analytic at t=0. From (5) it follows that the values of μ that
go with (8) to give a solution of (1) and (2) are representable in the form

(11) μ(t) = t^s K(t), K(0) ≠ 0,

where K(t) is analytic at t=0. For any particular t, neighboring t=0, and
corresponding values of a(t), b(t), and μ(t), the corresponding extremal g′
will be periodic.

Among the conditions that g′ be doubly-degenerate are that for the value
of t that gives g′ (§17),


(12) B_a[ω, a(t), b(t), μ(t)] = 1,
(13) B_b[ω, a(t), b(t), μ(t)] = 0.


We proceed to show that (12) and (13) do not hold identically in t. For if
(12) and (13) did hold identically in t, a performance of the following
indicated differentiation would show that


(14) ∂/∂t {B[ω, a(t), b(t), μ] − a(t)} = 0,


was an identity in t, where, as indicated, μ is to be held fast during the
differentiation and set equal to μ(t) thereafter. If use be made of (1) and (10)
we obtain the identity


(15) B[ω, a(t), b(t), μ] − a(t) = t^s F(t) + μD[a(t), b(t), μ],

and the partial derivative (14) becomes, upon using (11) and (15),

(16) s t^{s−1} F(t) + t^s F′(t) + t^s K(t) D_a a′(t) + t^s K(t) D_b b′(t).


This function does not vanish identically since s>0 and F(0)≠0. Hence
(12) and (13) do not hold identically, and accordingly hold simultaneously
for no value of t, neighboring t=0, other than t=0.
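The differentiation indicated in (14) can be written out with the chain rule. The sketch below assumes that the subscripts a and b in (12) and (13) denote partial derivatives of B with respect to its second and third arguments; that reading is an assumption made here.

```latex
\begin{align*}
\frac{\partial}{\partial t}\Bigl\{B[\omega,a(t),b(t),\mu]-a(t)\Bigr\}
  &= B_a\,a'(t) + B_b\,b'(t) - a'(t) \\
  &= 1\cdot a'(t) + 0\cdot b'(t) - a'(t) = 0
  \qquad\text{when } \mu=\mu(t) \text{ and (12), (13) hold.}
\end{align*}
```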

Now to any real value of μ not zero, but sufficiently near zero, there will
correspond, under (11), either no real value, one real value, or two real values
of t, according to the evenness, or oddness of s, and the sign of K(0). To such
a μ there will then correspond, by virtue of (8), no periodic extremal, one
periodic extremal, or two periodic extremals, and none of these extremals
will be doubly-degenerate, since (12) and (13) will not both hold.
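The count of real values of t can be illustrated with a small sketch. It assumes the leading-order model μ = K(0) t^s in place of the full relation (11), which is adequate only for μ near zero; the function names `count_real_t` and `real_t_values` are hypothetical and introduced here for illustration.

```python
import math


def count_real_t(s: int, k0: float, mu: float) -> int:
    """Count real t != 0 solving mu = k0 * t**s (leading-order model of (11))."""
    if s <= 0 or k0 == 0.0 or mu == 0.0:
        raise ValueError("need s > 0, K(0) != 0 and mu != 0")
    if s % 2 == 1:
        return 1  # odd s: exactly one real root, whatever the signs
    # even s: two real roots when mu and K(0) agree in sign, none otherwise
    return 2 if mu / k0 > 0 else 0


def real_t_values(s: int, k0: float, mu: float) -> list:
    """Return the real roots t of mu = k0 * t**s in increasing order."""
    n = count_real_t(s, k0, mu)
    if n == 0:
        return []
    ratio = mu / k0
    root = abs(ratio) ** (1.0 / s)
    if n == 1:
        return [math.copysign(root, ratio)]
    return [-root, root]
```

Each real t returned corresponds, through (8), to one periodic extremal of the modified problem in this simplified model.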

Similarly there may arise a finite number of other periodic extremals
from real solutions of (7) other than a(t), b(t), but in any case for a μ≠0,
and sufficiently near μ=0, these periodic extremals will not be
doubly-degenerate. Thus the theorem is proved.


Let the set σ of periodic extremals appearing in Theorem 7 be modified
by replacing each simply-degenerate periodic extremal g_1 of σ by an equivalent
non-degenerate extremal, or a null set of extremals, according to the conventions
at the end of §24. The set σ_1 of non-degenerate extremals thereby obtained will be
said to be equivalent in type to the given doubly-degenerate extremal g. If neither
σ nor σ_1 contains any extremals, g will be said to be equivalent to a null set of
extremals, and to be of neutral type.


PART V. RELATIONS IN THE LARGE BETWEEN PERIODIC EXTREMALS


26. The integrand and regions S and S_1. We make here again the
assumptions I and II of §7 and §8, qualifying F and S. We replace III by III′,
and IV by IV′ as follows:


III′. Let there be given a closed region S_1 consisting of points interior to S
bounded by two simple closed curves B_1 and B_2, of which B_2 lies within B_1. We
suppose both B_1 and B_2 consist of a finite number of analytic arcs without
singularities and are extremal-convex relative to S_1 in the sense of III, §8.

IV′. We assume that we have in S a proper field of extremals representable
in the form


(1) x = h(u, v), y = k(u, v),

where u is the parameter and v is the arc length measured along the extremals,
and h(u, v) and k(u, v) have a period ω in u. We assume further that for any
constant a and interval

(2) a ≤ u < a + ω,

the field (1) covers S_1 in a one-to-one manner, and that at each point (u, v) that
corresponds to a point (x, y) in S, the functions (1) are single-valued and analytic
in u and v, and

| h_u  h_v |
| k_u  k_v | ≠ 0.

The lemma of §8 will here be replaced by the following lemma of which the
method of proof is very similar to that used in §8. No proof need be given.


Lemma 1. The region S_1 of the (x, y) plane will correspond under (1) to a
region R of the (u, v) plane bounded by two unending arcs of the form


(3) v = A(u), A(u+ω) = A(u),
    v = B(u), B(u+ω) = B(u), A(u) < B(u),

where A(u) and B(u) are of class C for all values of u, and analytic in u except
for a finite set of values of u, on any interval of the form (2). The interior points
of R are the points (u, v) such that


(4) A(u) < v < B(u).


The correspondence between S_1 and any set of points of R limited as in (2)
will be one-to-one.




As in §9, so here, it follows that all extremals, except the extremals
u=constant, are representable in the form


(5) v = M(u),


where M(u) is an analytic function of u for all values of u that give points
(u, v) in R. Further any extremal joining two points in R will have no
points on the boundary of R, with the possible exception of its end points.

We shall concern ourselves at first with periodic extremals continuously
deformable in S_1, in the (x, y) plane, into either one of the two closed
boundary curves of S_1, taken just once. In R, in the (u, v) plane, these extremals
will be representable in the form (5), with M(u) possessing the period ω.
A point of difference between the developments for periodic extremals
and those for extremals joining two fixed points A and B, is that in the
latter case we were able to prove, under hypotheses I, II, III, and IV,
of §7 and §8, that there were at most a finite number of extremals joining
A to B, while in the case of periodic extremals the hypotheses I, II, III′,
and IV′ are not sufficient to bar the existence of analytic families of periodic
extremals lying in S_1 and deformable into a boundary of S_1. Simple examples
can be given to show the truth of this statement. However, cases where
families of periodic extremals exist are certainly specialized in that the
differential equation of the first variation, that is the J. D. E., set up for each
member of such families, must possess a periodic solution not identically
zero. Not only is this true but the differential equations of the variations
of orders higher than the first must all possess periodic solutions. In excluding
such families we are therefore excluding exceptional cases. We are the more
justified in such exclusion by the fact that we are developing a theory that
will serve to prove the existence of additional periodic extremals when a
finite set of such extremals is given. If infinite sets of mutually deformable
periodic extremals existed, then more than we hoped to prove would be
granted. We state here the fundamental lemma which replaces Lemma 2
of §11.


Lemma 2. Let there be given regions S and S_1 satisfying I and II of §7,
and III′ and IV′ of this section. In S_1 suppose there are at most a finite number
of periodic extremals continuously deformable into either boundary of S_1,
taken just once, and that all of these extremals are non-degenerate (§17). Let
the number of these extremals which are of type k (§21) be denoted by M_k. Let m
be the maximum of the integers k. Then between these numbers M_k the relations
(R) of §11 hold.


This lemma is proved with the aid of the fundamental lemma on critical
points, §11, and Theorem 5, §21, in a manner similar to the manner of proof
of Lemma 2, §11.

27. The theorem in the large. In order to remove the restriction from
the above lemma that the periodic extremals appearing there be
non-degenerate we cannot rely on the envelope theory as in the case of extremals
joining two fixed points. The following lemma will serve our purpose. In
this lemma let the (n−1)-dimensional hypersphere with radius r and
center (A) in the space of (v_1, ..., v_n) be denoted by S_A^r.


Lemma 1. Let there be given a function J(v_1, ..., v_n) of class C′′′ at
each point (V) = (v_1, ..., v_n) of an open n-dimensional region Σ. Let (A)
= (a_1, ..., a_n) be a point of Σ at which J(v_1, ..., v_n) has an isolated critical
point. Let J(v_1, ..., v_n, μ) be a function of class C′′′ for (V) in Σ, and μ
in the neighborhood of μ=0, and such that in Σ


(1) J(v_1, ..., v_n, 0) ≡ J(v_1, ..., v_n).


Let e be a positive constant so small that J(v_1, ..., v_n) has no critical
point other than (A) within or on S_A^{2e}. Then for any fixed value of μ≠0 and
sufficiently small in absolute value, it is possible to replace J(v_1, ..., v_n)
by a function φ(v_1, ..., v_n) of class C′′′ throughout Σ, and of such sort that in
S_A^e


(2) φ(v_1, ..., v_n) ≡ J(v_1, ..., v_n, μ),

while in Σ but outside of S_A^{2e}

(3) φ(v_1, ..., v_n) ≡ J(v_1, ..., v_n),

and finally in the closed domain between S_A^e and S_A^{2e}, φ(v_1, ..., v_n) has no
critical points.


To prove this lemma we are going to use a function h(x) that is of class
C′′′ for all values of x, and is such that for the preceding constant e


(4) h(x) = 1, |x| ≤ e,
    h(x) = 0, |x| ≥ 2e.


Such a function as h(x) can readily be set up in terms of the elementary
functions.
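One way such a function might be set up is sketched below. It is not taken from the text: it uses the standard exp(−1/t) construction, which gives a function with derivatives of all orders and so in particular one of class C′′′. The names `_f` and `cutoff` are hypothetical.

```python
import math


def _f(t: float) -> float:
    """exp(-1/t) for t > 0 and 0 otherwise; smooth for all real t."""
    return math.exp(-1.0 / t) if t > 0 else 0.0


def cutoff(x: float, e: float) -> float:
    """A candidate h(x): equal to 1 for |x| <= e and to 0 for |x| >= 2e."""
    if e <= 0:
        raise ValueError("e must be positive")
    s = (2.0 * e - abs(x)) / e  # s >= 1 when |x| <= e, s <= 0 when |x| >= 2e
    a = _f(s)
    b = _f(1.0 - s)
    # a and b never vanish together, since s > 0 or 1 - s > 0 always holds
    return a / (a + b)
```

Between |x| = e and |x| = 2e the value decreases from 1 to 0, and it stays within the interval from 0 to 1 throughout.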
For points (v_1, ..., v_n) of the closed domain between S_A^e and S_A^{2e} set


(5) D(v_1, ..., v_n, μ) = J(v_1, ..., v_n, μ) − J(v_1, ..., v_n),


and now define φ for the same points (v_1, ..., v_n) as follows:


Now the constant μ can be chosen so small in absolute value that the function
D of (6), as well as each of the partial derivatives D_{v_i}, is less in absolute
value than a preassigned positive constant d throughout the whole region
between S_A^e and S_A^{2e}. For the same domain the sum of the squares of the
first partial derivatives of J(v_1, ..., v_n) exceeds some positive constant.
It follows readily from (6), that for a μ≠0, sufficiently small in absolute
value, the function φ of (6) will have no critical point between S_A^e and
S_A^{2e}. We accordingly understand such a value of μ chosen, and hereafter
held fast.

If now the function φ(v_1, ..., v_n) be defined at the remaining points of
Σ by (2) and (3), it is readily seen that φ(v_1, ..., v_n) has the properties
affirmed in the lemma.
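The estimate used in this proof can be made explicit under one assumed form of φ. The sketch below takes φ = J + h(ρ)D between the two hyperspheres, a form that agrees with (2) where h = 1 and with (3) where h = 0. It is offered as an assumption, not as the author's display (6). The symbols ρ (distance from (A) to (V)), c (a positive lower bound for the square root of the sum of squares of the first partial derivatives of J on that domain) and H (a bound for |h′|) are introduced here for illustration, and h is assumed to take values between 0 and 1.

```latex
\begin{align*}
\varphi &= J + h(\rho)\,D, \qquad \rho^2=\sum_{i=1}^n (v_i-a_i)^2,\\
\varphi_{v_i} &= J_{v_i} + h(\rho)\,D_{v_i} + D\,h'(\rho)\,\frac{v_i-a_i}{\rho},\\
\Bigl(\sum_i \varphi_{v_i}^2\Bigr)^{1/2}
  &\ge \Bigl(\sum_i J_{v_i}^2\Bigr)^{1/2} - \sqrt{n}\,d - H\,d
  \;\ge\; c - (\sqrt{n}+H)\,d \;>\; 0
  \quad\text{if } d < \frac{c}{\sqrt{n}+H}.
\end{align*}
```

Under these assumptions the gradient of φ cannot vanish between the two hyperspheres once d is small enough, which is the property the lemma requires there.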

We are now in a position to state the following theorem. 


THEOREM 8. Let there be given regions S and S_1 and integrand F(x, y, ẋ, ẏ)
satisfying I, II, III′, IV′ of §7 and §26. In S_1 suppose there are at most a finite
set T of periodic extremals g, continuously deformable into a boundary curve of
S_1. In the set T let each degenerate extremal be replaced by an equivalent set
of non-degenerate extremals in accordance with the conventions at the ends of
§24 and §25. Let M_k be the number of periodic extremals of type k in the resulting
set, and m be the maximum of the numbers k. Between the numbers M_k the
relations (R) of §11 hold.


If the above periodic extremals g are all non-degenerate, the present 
theorem is identical with Lemma 2, §26. 

Suppose on the other hand that not all of the above periodic extremals
g are non-degenerate. To be specific, suppose the above set includes a
simply-degenerate periodic extremal g_1 which taken from u=0 to u=ω
is convex or concave. Then according to Theorem 6 of §24 it will be possible
to modify the integrand by the introduction of a parameter μ in such a
fashion, that for a suitably chosen value of μ, say μ_1, arbitrarily small in
absolute value, there will appear, instead of g_1, in the neighborhood of g_1,
a non-degenerate extremal E of what we have agreed to take as the type
equivalent to g_1. Corresponding to this introduction of a parameter μ into
the integrand, the function J(v_1, ..., v_n) whose critical points correspond
to the given periodic extremals will be replaced by a function
J(v_1, ..., v_n, μ). In setting up J(v_1, ..., v_n, μ) we can and will use the same field of
extremals (IV′, §26) and corresponding parametric system (u, v) as in
setting up J(v_1, ..., v_n). The broken extremals along which
J(v_1, ..., v_n, μ_1) is to be the integral will, however, be taken as the extremals
corresponding to the integrand in which μ=μ_1. Let (A) = (a_1, ..., a_n) be the
point at which the unmodified function J(v_1, ..., v_n) has the isolated
critical point corresponding to g_1. By virtue of our modification of the
integrand of the problem, J(v_1, ..., v_n, μ_1) will have, in the neighborhood
of the point (A), a critical point (B) of the type termed equivalent to that
of g_1.

It follows from the lemma of this section that we can set up a function
φ(v_1, ..., v_n) that will have in a suitably chosen neighborhood of (A)
no other critical point than (B), and that will be identical with J(v_1, ..., v_n)
without this neighborhood of (A).

We can similarly modify J(v_1, ..., v_n) in the neighborhood of each other
critical point (A) that corresponds to a simply-degenerate periodic extremal
g_1, so that the modified function has in the neighborhood of (A) a critical
point (B) of a type equivalent to that of g_1, in case g_1 is convex or concave,
or has no critical point at all in the neighborhood of (A) in case g_1 is
convex-concave, while except for the neighborhoods of these points (A), the modified
function is identical with the original function J(v_1, ..., v_n).

In case the set T includes doubly-degenerate periodic extremals, it
follows from Theorem 7, §25, and Lemma 1 of this section, that we can first
modify J(v_1, ..., v_n) in such a fashion that the resulting function no longer
has critical points corresponding to doubly-degenerate extremals. We can
then, by additional successive modifications, obtain finally a function
replacing J(v_1, ..., v_n) whose critical points all correspond to non-degenerate
extremals, including thereby all of the non-degenerate extremals of the
original set T, together with the complete set of non-degenerate extremals
equivalent to the degenerate extremals of T. The theorem then follows
upon applying the lemma on critical points of §11 to the function finally
evolved from J(v_1, ..., v_n).


Part VI. DEFORMATION THEORY 


28. Families of curves joining A to B. In this part of the paper we shall
show how the type of a given extremal segment can be characterized in
terms of the possibility or impossibility of making certain deformations of
families of curves joining A to B. The methods will be extended to the case
of closed extremals in a later section. The results obtained, it is hoped by
the author, will serve as a part of a necessary basis for a “theory in the
large” more extensive than any already developed.


By an m-family of curves, Z_m, will be understood a set of ordinary curves
(B, p. 192) lying in the (x, y) plane and passing from an initial point A to a
final point B; where, further, the set of all points of Z_m make up a single-valued
continuous point function of the points on a product complex C_{m+1} obtained by
combining an arbitrary point t on a closed interval of the t axis with an
arbitrary point P on some m-dimensional manifold* M_m; and where, finally, the
dependence of the points of Z_m upon the points of C_{m+1} is such that to hold P
fast and vary t gives a representation of the individual curves of Z_m by virtue
of which they may be termed ordinary.


The point P will be called the parametric point and the manifold M_m the
parametric manifold.

Let there be given two families Z_m and Z_m′ consisting of curves which
join the same two points A and B. If the parametric manifolds of Z_m
and Z_m′ are homeomorphic the two m-families can be represented by the
aid of the parametric manifold of either Z_m or Z_m′, in particular, say, by
the manifold M_m. In such a case Z_m and Z_m′ will be said to be mutually
deformable if corresponding to each value of a parameter μ on the interval

0 ≤ μ ≤ 1

there exists an m-family Z_m(μ) of which Z_m(0) is Z_m and Z_m(1) is Z_m′,
while each m-family Z_m(μ) is representable in terms of the same parametric
manifold M_m, and the same interval for t, and joins the same two points, A
and B; and if further the complete set of points (x, y) on these m-families,
Z_m(μ), by virtue of their dependence upon μ, t and P, make up a single-valued
continuous point function of an arbitrary point on the product complex C_{m+2}
obtained by combining an arbitrary point μ on its interval, an arbitrary point
t on its interval, and an arbitrary point P on M_m.


In a deformation D such as the one just defined Z_m will be called the
initial family and Z_m′ the final family. The m-families Z_m(μ), for values of
μ between 0 and 1, will be called the intermediate families. A point P=P_0
on M_m, and a value μ=μ_0 held fast while t varies, determine a curve in
Z_m(μ_0). The curve P=P_0 of Z_m will be said to be replaced in the deformation
D when μ=μ_0, by the curve P=P_0 of Z_m(μ_0). Points on a curve P of Z_m,
and points on any curve replacing P under D will be said to correspond
if they are given by the same values of t. If we should hold P and t fast
in Z_m(μ), and vary μ, the resulting set of points would be the locus of points
corresponding under D to a single point of Z_m.


* Cf. Veblen, The Cambridge Colloquium, 1916, Part II, Analysis Situs, p. 88.

We hereby understand that all of the definitions of this section have
been given in terms of (u, v) as well as of (x, y).

29. Hypotheses. Fundamental lemmas on deformations. We make here
concerning F(x, y, ẋ, ẏ), and the given extremal g, the same hypothesis
as in §1, except that here we suppose that F and g are of class C′′′′ instead
of class C′′′, and that F(x, y, ẋ, ẏ) is positive, not only along g, but also for
(x, y) on g and for (ẋ, ẏ) any two numbers not both zero. A consequence of
the assumption that F be of class C′′′′ instead of class C′′′, is that the
function J(v_1, ..., v_n) set up in §1 is here of class C′′′ for (v_1, ..., v_n)
in the neighborhood of (0, ..., 0). As in §2 we transfer the problem to
the (u, v) plane, carrying g into a segment γ of the u axis.

By a canonical curve will be understood a succession of extremal segments
joining the successive points of the set


(1) (u_0, 0), (u_1, v_1), ..., (u_n, v_n), (u_{n+1}, 0),


determined as in §1. An m-family of canonical curves will be called a
canonical m-family.

Let z stand for any positive constant a, b, c, d, e, etc. Let R_z denote the
set of points (u, v) within a distance z of γ.


Lemma 1. Let R_1 be any region in the (u, v) plane enclosing γ in its interior
and in which the problem is “regular.” If a be a sufficiently small positive
constant, the region R_a, consisting of the points (u, v) within a distance a of γ,
will possess the following property. Any m-family Z_m consisting of curves that
lie in R_a, join γ's end points, and give to J a value such that


(2) J ≤ J_0 − e,


where e is a positive constant, and J_0 is the value of J along γ, can be deformed
within R_1, through the mediation of curves that always satisfy (2), into an
m-family of canonical curves. This deformation can be so made that if any of the
curves of Z_m are canonical curves they are replaced in the deformation only by
curves along which v is a function of u of at least class C.


Before coming to the proof proper we make a number of preliminary
statements and definitions.

(A) We can and will choose a positive constant r so small that of the
intervals I_i:


(3) u_i − r ≤ u ≤ u_i + r (i = 0, 1, ..., n+1),


no two successive intervals I_i and I_{i+1} have any points in common or contain
any conjugate points of each other.
(B) If in the results obtained by Lindeberg* in a paper cited below we set


G(x, y, x′, y′) = (x′^2 + y′^2)^{1/2},


we readily obtain the following. Corresponding to the positive constant r
just chosen in (A) there can be found a region R_b so small that if γ̄ be any
“ordinary curve” (B, p. 192) joining γ's end points within R_b and satisfying
(2), if ū and u are u coördinates of points Q̄ and Q on γ̄ and γ respectively,
and if Q̄ and Q lie at the same distance measured along γ̄ and γ respectively
from (u_0, 0) or from (u_{n+1}, 0), then


|ū − u| < r.
(C) We will now choose a region R_c as follows. We first require the region
R_c to lie within the region R_b of the preceding paragraph (B). Further we
can and will choose c so small (B, pp. 275 and 307) that any two distinct
points Q and Q′ which both lie in one of the regions


(4) (i = 1, 2, ..., n+1)


can be joined in the order Q, Q′ by an extremal segment E with the following
properties:
(a) The extremal E gives a minimum to J relative to all “ordinary curves”
joining Q to Q′ and lying in (4).

(b) The coördinates (u, v) of points of E are functions of at least class
C′ of the coördinates of Q and Q′ and of the distance s of the points (u, v)
from Q measured along E.

(c) In case Q and Q′ lie in two successive intervals I_i and I_{i+1} of (A),
the coördinate v of a point (u, v) on E will also be a function of class C′
of u as well as of the coördinates of Q and Q′.

(D) Finally we choose a region R_d which we will prove can serve as the
region R_a of the lemma. We first require that d be so small that if the end
points of any of the extremal segments E described in (C) lie in R_d the whole
of E will lie in R_c. A second requirement on d will be added later.

Now let Z_m be any m-family each curve of which lies within R_d and
satisfies the relation (2). In accordance with the conventions of §28, let P be the
parametric point of the family Z_m, and M_m the parametric manifold on which
P lies. We come now to the deformation of Z_m into an m-family of canonical
curves. This deformation will be given as the resultant of two deformations,
D′ and D′′, which obviously can be combined into a single deformation
if the final m-family of D′ is identical with the initial m-family of D′′.


* J. W. Lindeberg, Über einige Fragen der Variationsrechnung, Mathematische Annalen, vol.
67 (1909), p. 351, §8.

Let each curve of Z_m be divided into n+1 successive segments of which
the ith, say g_i, consists of points for which the arc length s measured from
(u_0, 0) satisfies

u_{i−1} ≤ s ≤ u_i (i = 1, ..., n),


and when i=n+1 consists of the remaining points on the given curve. In
accordance with the preceding results, (A) and (B), the coördinates of the
points of g_i will satisfy


(5) u_{i−1} − r ≤ u ≤ u_i + r (i = 1, ..., n+1).


The deformation D′ will be defined as follows: For any value of the
deforming parameter μ for which

0 ≤ μ ≤ 1

let each arc g_i on each curve of Z_m be divided into two successive segments
whose arc lengths are in the ratio of μ to 1−μ. Corresponding to the given
value of μ let the second of these two segments of g_i be replaced by itself but
let the first of these two segments, say k_i, be replaced by that extremal
segment, say h_i, that belongs to the class of extremals described in (C) and
that joins k_i's end points. Points of h_i and k_i which divide h_i and k_i
respectively in the same ratio as measured by arc lengths, shall be made to
correspond, and shall accordingly be assigned the same values of t, namely the
values already assigned to the points of k_i in the representation of Z_m
in terms of t and P.

That we have here actually defined a deformation of Z_m readily follows
from (C). Let Z_m′ denote the final m-family of the deformation D′. According
to paragraph (D) the curves of Z_m′ will all lie in R_c, and according to
paragraph (C), (c), their v coördinates will be functions of their u coördinates
of class C at least. Further, the curves used in the deformation D′ give
to J at most the value which the corresponding curves of Z_m give, so that
the curves used in D′ satisfy (2).
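The step that makes D′ admissible, namely that replacing the first part of an arc by a minimizing extremal cannot raise J, can be illustrated in a toy setting. The sketch assumes the integrand is arc length, so that extremals are straight segments; this is a special case chosen here for illustration, not the general F of the text. It also treats a whole polyline as a single arc g_i. The helper names `length` and `replace_initial_part` are hypothetical.

```python
import math


def length(points):
    """Arc length of a polyline given as a list of (u, v) pairs."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def replace_initial_part(points, mu):
    """Replace the first mu-fraction (by arc length) of the polyline by its chord."""
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie on the interval from 0 to 1")
    target = mu * length(points)
    travelled = 0.0
    for i, (p, q) in enumerate(zip(points, points[1:])):
        seg = math.dist(p, q)
        if travelled + seg >= target:
            # locate the division point on the current segment
            t = 0.0 if seg == 0 else (target - travelled) / seg
            cut = (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))
            # chord from the start to the division point, remainder unchanged
            return [points[0], cut] + list(points[i + 1:])
        travelled += seg
    return [points[0], points[-1]]
```

As μ runs from 0 to 1 the curve passes from the original polyline to the single chord joining its end points, and by the triangle inequality the length never exceeds that of the original.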

We can now define a second deformation D′′ under which Z_m′ will be
deformed into a family of canonical curves satisfying (2). The definition
of D′′ is similar to that of D′ except that the segments g_i into which we divide
the curves of Z_m′ can here be defined in terms of u instead of s as follows:


u_{i−1} ≤ u ≤ u_i (i = 1, 2, ..., n+1).


The combination of the deformations D′ and D′′ will obviously give a
deformation of the desired sort and the lemma is proved except for its
concluding sentence. This concluding sentence is also readily seen to be
true if the constant d of (D) be further restricted in magnitude so that the
slopes of the canonical curves in Z_m will be less in absolute value than a
sufficiently small positive constant.

In the deformation D′′ a curve which initially was canonical is for each
value of the deforming parameter μ replaced by itself. If in the original
m-family the variable t had been u, then of the deformations D′ and D′′
only D′′ would have been needed. We thus have the following lemma.


Lemma 2. If in Lemma 1 an m-family Z_m is given in which the variable
t=u, then the deformation whose existence is affirmed in Lemma 1 can be
defined in such a fashion that any curve of Z_m which is a canonical curve remains
unaltered throughout the deformation.


The following lemma is important.


Lemma 3. Let R_1 be a region of the (u, v) plane enclosing γ in its interior,
and in which the problem is regular. If e′ be any sufficiently small positive
constant, the region R_{e′} consisting of points (u, v) within a distance e′ of γ
has the following property. If a canonical m-family Z_m can be deformed within
R_{e′} into a canonical m-family Z_m′ through a deformation D all of whose curves
give to J a value such that


(6) J ≤ J_0 − e,


where e is a positive constant, and J_0 the value of J along γ, then Z_m can be
deformed into Z_m′, within R_1, through the mediation of curves all of which are
canonical and which satisfy (6).


For a moment suppose the region R_1 given in Lemma 1 is the region R_1
given in the present lemma. Corresponding to this choice of R_1 let R_{a′} be
a particular choice of the region R_a such that Lemmas 1 and 2 hold true.
Now let us apply Lemma 1 again, this time taking for the region R_1 of
Lemma 1 the region R_{a′} just determined. Corresponding to this second
choice of R_1, Lemma 1 will hold true for a second proper choice of R_a which
we now denote by R_{a′′}. We will prove Lemma 3 is true if R_{e′} in Lemma 3
be taken as R_{a′′}.

We first note that the complete set of curves used in the deformation D
can be considered as an (m+1)-family Z_{m+1}. For if M_m is the common
parametric manifold of Z_m and Z_m′, each curve of D is specified by a pair
(P, μ), where P is on M_m and μ on its interval (0, 1). These curves of D
can then be regarded as specified by a point P_1 = (P, μ) the totality of which
points form an (m+1)-dimensional manifold M_{m+1}.


Now the (m+1)-family Z_{m+1} can be deformed subject to (6), and within
R_{a′}, into a canonical (m+1)-family Z̄_{m+1} using thereby the deformation,
say D̄, of Lemma 1. Under D̄ the given canonical m-families Z_m and Z_m′
appear initially in Z_{m+1}, and will be replaced finally in Z̄_{m+1} by canonical
families, say Z̄_m and Z̄_m′ respectively. We will prove the lemma by showing
how each of the first three canonical m-families of the set


(7) Z_m, Z̄_m, Z̄_m′, Z_m′


can be deformed, subject to (6), into its successor in (7) by a deformation
in which only canonical curves are used.

(a) The deformation of Z_m into Z̄_m. We may fix our attention upon
those curves of D̄ which serve to deform Z_m into Z̄_m and let Z′_{m+1} be the
(m+1)-family of curves in D̄ replacing Z_m, including also Z_m and Z̄_m.
According to the concluding sentence of Lemma 1, Z′_{m+1} consists of curves
along which v is a function of u of class C. According to Lemma 2, Z′_{m+1}
can then be carried by a deformation D′ which does not alter Z_m and Z̄_m
into a set of canonical curves. The final canonical (m+1)-family of curves
of D′ may be considered as the curves of the desired deformation, say D_1,
of Z_m into Z̄_m.

(b) The deformation of Z̄_m into Z̄_m′. The deformation, say D_2, required
here is furnished by the set of canonical curves which make up Z̄_{m+1}.

(c) The deformation of Z̄_m′ into Z_m′. This deformation, say D_3, is set
up as D_1 was set up in (a).

The combination of D_1, D_2, and D_3, in the order written, will give the
required deformation of Z_m into Z_m′ and the lemma is proved.

30. The analysis situs of the deformation problem. We have been
denoting by J(v_1, ..., v_n) the value of the integral J along the canonical curves
joining the points (1) of §29. If (u_0, 0) is not conjugate to (u_{n+1}, 0) but if
there are k conjugate points of (u_0, 0) preceding (u_{n+1}, 0), it follows from
Theorem 2 of §6 that J(v_1, ..., v_n) has a critical point of rank n and type
k at the point (v_1, ..., v_n) = (0, ..., 0). It follows from a lemma proved
by the author* that there exists a one-to-one continuous transformation of
the variables (v_1, ..., v_n) into variables (y_1, ..., y_n) in which the point
(v_1, ..., v_n) = (0, ..., 0) corresponds to (y_1, ..., y_n) = (0, ..., 0), and
under which


(1) J(v_1, ..., v_n) − J(0, ..., 0) = −y_1^2 − ... − y_k^2 + y_{k+1}^2 + ... + y_n^2


for (y_1, ..., y_n) in the neighborhood of (0, ..., 0). We exclude for the
present the case where k=0.


* Marston Morse, these Transactions, loc. cit., p. 354.
For the sake of brevity we set 


(2) t---+y2, O0<ksn, 
(3) q? = yer t+ 


understanding g? to be identically zero if k=n. Now let a be a positive con- 
stant so small that the domain 


(4) a 


is one for which (1) holds as described. 

In the present section we shall concern ourselves with deformations,
in the neighborhood of the extremal segment $\gamma$, of m-families $Z_m$ composed
of canonical curves subject to a condition of the form


(5) $J \le J(0, \cdots, 0) - e^2, \quad e < a,$


where e is an arbitrarily small positive constant. We shall confine ourselves
to m-families corresponding to which the parametric manifold $M_m$ is closed.
Corresponding to each point P on $M_m$ there is a curve on the given m-family
$Z_m$. In the present case each such curve is canonical, and the points at which
it crosses the straight lines $u = u_i$ have for their v coordinates $v_i$. Thus
each point P on $M_m$ determines a point $(v_1, \cdots, v_n) = (V)$. If we regard the
points (V) as distinct when they arise from distinct points P on $M_m$, the set
of all such points (V) forms a closed m-dimensional manifold $M_m'$ homeomorphic
with the manifold $M_m$. If the given canonical m-family be deformed
in the sense of §28, the manifold $M_m'$, arising at each stage of the deformation
from the m-family which replaces $Z_m$ at that stage, will itself be "homotopically
deformed" in the sense of analysis situs.*

We turn next to the space of the points $(Y) = (y_1, \cdots, y_n)$ and confine
ourselves to a spherical neighborhood of the origin in the form (4), and to
points in the space of the points (V) that correspond to the neighborhood
in the space of the points (Y). Corresponding to a deformation of $M_m'$
in the space of the points (V) we shall have to deal with a deformation of a
corresponding manifold $M_m''$ in the space of the points (Y). Our requirement
that the curves of the given m-family satisfy (5) becomes in terms of
the points (Y) on the manifold $M_m''$ the condition


(6) $p^2 - q^2 \ge e^2, \quad 0 < k \le n.$


* Veblen, loc. cit., pp. 125-126. 


1928] CALCULUS OF VARIATIONS IN THE LARGE 267 


31. Homotopic deformations in the space (Y). The preceding condition
(6) and the restriction of the points (Y) to points in a spherical neighborhood
of $(Y) = (0, \cdots, 0)$ combine to give us the relations


(1) $p^2 + q^2 \le a^2, \qquad p^2 - q^2 \ge e^2, \qquad 0 < k \le n.$


The relations (1) define a finite closed region in the space of the points (Y).
The deformation of m-families thus leads us to study deformations of
closed manifolds in the space (1).

We could get some but not all of the properties of the space (1) by
breaking it up into cells, and showing that of its connectivity numbers


$R_0, R_1, R_2, \cdots$


all are unity except one, namely
$R_{k-1} = 2.$


Note that the (k-1)-dimensional sphere
(2) $p^2 = a^2, \quad q = 0$


is among the points of (1). Denote this sphere by $S_{k-1}$. A very important
fact is that the region (1) can be deformed through the mediation of its own
points into a singular complex on $S_{k-1}$. In terms of a deforming parameter
$\mu$ the set of intermediate and final positions, $(y_1, \cdots, y_n)$, corresponding to
any initial point $(a_1, \cdots, a_n)$ of (1), in a deformation D of (1) into a complex
on $S_{k-1}$ can be given as follows:


(3) $y_i^2 = (1 - \mu) a_i^2 + \mu a^2 \dfrac{a_i^2}{a_1^2 + \cdots + a_k^2} \quad (i = 1, 2, \cdots, k),$
    $y_i^2 = (1 - \mu) a_i^2 \quad (i = k+1, k+2, \cdots, n),$


where
(4) $0 \le \mu \le 1,$


and where the coordinates $(y_1, \cdots, y_n)$ are respectively required to have the
sign of the coordinates $(a_1, \cdots, a_n)$ from which they are deformed.
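The deformation just described can be checked numerically. The following is a minimal sketch, not the author's construction verbatim: it assumes that the region (1) is $p^2 + q^2 \le a^2$, $p^2 - q^2 \ge e^2$, and that the deformation takes the radial form $y_i^2 = (1-\mu)a_i^2 + \mu a^2 a_i^2/(a_1^2+\cdots+a_k^2)$ for $i \le k$ and $y_i^2 = (1-\mu)a_i^2$ for $i > k$. The function names and the sample values n=4, k=2, a=1, e=0.3 are illustrative assumptions.

```python
import math
import random


def p2_q2(y, k):
    """Return (p^2, q^2): sums of squares of the first k and of the remaining coordinates."""
    p2 = sum(c * c for c in y[:k])
    q2 = sum(c * c for c in y[k:])
    return p2, q2


def in_region(y, k, a, e, tol=1e-12):
    """Membership test for the assumed region p^2 + q^2 <= a^2, p^2 - q^2 >= e^2."""
    p2, q2 = p2_q2(y, k)
    return p2 + q2 <= a * a + tol and p2 - q2 >= e * e - tol


def deform(point, k, a, mu):
    """Assumed deformation: mu = 0 returns the initial point, mu = 1 a point with p^2 = a^2, q = 0."""
    p2, _ = p2_q2(point, k)  # p2 >= e^2 > 0 for any point of the region
    out = []
    for i, c in enumerate(point):
        if i < k:
            sq = (1 - mu) * c * c + mu * a * a * c * c / p2
        else:
            sq = (1 - mu) * c * c
        # Each coordinate keeps the sign of the coordinate it is deformed from.
        out.append(math.copysign(math.sqrt(sq), c))
    return out


def sample_region(n, k, a, e, rng):
    """Rejection-sample one point of the region from the cube [-a, a]^n."""
    while True:
        y = [rng.uniform(-a, a) for _ in range(n)]
        if in_region(y, k, a, e, tol=0.0):
            return y
```

Under these assumptions $p^2 + q^2$ and $p^2 - q^2$ are both convex combinations of their initial values with $a^2$, which is why every intermediate point remains in the region when $e < a$.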

We can now prove that any m-dimensional closed manifold $M_m$, singular
or non-singular, lying on (1) and such that m is less than k-1, can be deformed
on (1) into a point. For under the above deformation (3), $M_m$ can be deformed
into a manifold $M_m'$ on $S_{k-1}$. Regardless of whether $M_m'$ is singular
or non-singular on $S_{k-1}$, there is a fundamental theorem on homotopic
deformations* to the effect that $M_m'$ can be deformed on $S_{k-1}$ into a manifold


* Veblen, loc. cit., p. 131. 




$M_m''$ on $S_{k-1}$ each of whose cells "covers" a cell of $S_{k-1}$. Hence $M_m''$ can be
deformed into a point on $S_{k-1}$; and the statement in italics follows at once.

The sphere $S_{k-1}$ cannot itself be deformed on (1) into a point on (1). For
if there were such a deformation, say D', let the family of intermediate manifolds
homeomorphic with $S_{k-1}$ under D' be deformed onto $S_{k-1}$ under the
deformation (3). The resulting family of manifolds on $S_{k-1}$ would constitute
a deformation on $S_{k-1}$ in which the initial manifold would cover
$S_{k-1}$ just once and the final manifold coincide with a point. This, however,
is impossible according to a fundamental theorem of analysis situs.*

We can now add the statement that $S_{k-1}$ cannot be deformed on (1) into
any manifold on a manifold $M_r$ on (1) of lower dimension than k-1. For
according to the preceding paragraph, $M_r$, and hence $S_{k-1}$, could then be deformed
on (1) into a point on (1). By making use of product manifolds analogous to
surfaces of revolution in 3-space it is easy to set up for each m>k-1 an
example of a manifold on (1) which cannot be deformed on (1) into any
manifold of dimensionality less than k-1.

Finally with the aid of the deformation (3) we obtain the following result.
Any closed manifold $M_m$ on (1) for which $m \ge k-1$ can always be deformed on
(1) into a manifold on $S_{k-1}$, and in special cases can be further deformed on
(1) into a point.

32. The theorem on deformations of m-families joining A to B. We 
can now translate the results of the preceding section into terms of defor- 
mations of canonical m-families. We shall understand that an m-family 
shall be considered as “on” an r-family if every curve of the m-family coincides 
with some curve of the r-family. Let it also be understood that a closed m- 
family shall mean an m-family whose parametric manifold is closed. In 
particular it should be noted that a closed 0-family means a pair of ordinary 
curves joining the same two points. The results of the preceding section 
now give the following lemma. 


Lemma. Let $\gamma$ be the extremal segment v=0 of §29. Suppose there are
k points (k>0), conjugate to $\gamma$'s initial point A and preceding $\gamma$'s final point B,
but that A is not conjugate to B. Then within any region R enclosing $\gamma$ in its
interior there exists a region $R_1$ enclosing $\gamma$ in its interior, and corresponding
to $R_1$ an arbitrarily small positive constant e such that the closed m-families of
canonical curves that satisfy


(1) $J \le J_0 - e^2,$


where $J_0 = J(0, \cdots, 0)$ is the value of J along $\gamma$, and lie in $R_1$ are conditioned
as follows:


* Veblen, p. 131, §14. 




(a) Those for which $m < k-1$ can be deformed among canonical curves,
within $R_1$, and subject to (1), into a single canonical curve.

(b) Those for which $m \ge k-1$ include for each m at least one m-family
that cannot be deformed among canonical curves, within $R_1$, and subject to (1),
into a single canonical curve, or even into a (k-1)-family on an r-family for
which $r < k-1$.

(c) Those for which $m \ge k-1$ can always be deformed among canonical
curves, within $R_1$, and subject to (1), into an m-family of canonical curves on a
(k-1)-family $Z_{k-1}$, and, in special cases, can be further deformed into a single
canonical curve.


This lemma follows from the italicized statements of the preceding section,
taking for $Z_{k-1}$ the (k-1)-family determined by the points (V) that correspond
to the points (Y) on $S_{k-1}$.

Lemmas 1, 2, 3 of §29 tell how and when any m-family of ordinary curves
in the (u, v) plane can be deformed into a closed m-family of canonical curves.
The preceding lemma describes the limitations on the deformations of
closed m-families of canonical curves. The combination of the results
of these lemmas carried over into the (x, y) plane gives the following fundamental
theorem. (Here it will be convenient to call a region enclosing g
in its interior, and consisting of points (x, y) within an arbitrarily small
positive constant distance of g, an arbitrarily small neighborhood of g. The
term a sufficiently small neighborhood of g will be similarly defined.)


THEOREM 9. Let there be given in the (x, y) plane the extremal g of §29.
On g let there be k points conjugate to A, k>0, but suppose B is not conjugate
to A. Let $J_0$ be the value of the integral J along g.

Then corresponding to any sufficiently small neighborhood R of g, there
exists within R an arbitrarily small neighborhood $R_1$ of g, and an arbitrarily
small positive constant e with the following properties. Closed m-families of
curves which lie in $R_1$, which join g's end points, and give to the integral J a
value for which (1) holds, are conditioned as follows:

(a) Those for which $m < k-1$ can be deformed in R and subject to (1) into
a single curve.

(b) Those for which $m \ge k-1$ include for each m at least one m-family
that cannot be deformed, within R and subject to (1), into any (k-1)-family on
an r-family for which $r < k-1$, or into a single curve.

(c) Those for which $m \ge k-1$ can always be deformed, within R and subject
to (1), into an m-family on a (k-1)-family, and in special cases can be further
deformed into a single curve.




There remains the case where A is conjugate to B. Here we need to
assume that the problem is analytic, as well as regular, in the neighborhood
of g. We suppose that not every extremal through A with slope at A near
that of g passes through B. In accordance with the conventions preceding
Theorem 3, §12, g will have a type number k=s, or k=s+1, or type numbers
k=s and s+1, where s is the number of points on g conjugate to A and
preceding B. It can be shown that if k=s, or k=s+1, the preceding theorem
holds if the sentence concerning conjugate points be omitted, while if g is of
the composite type where k=s and k=s+1, the preceding theorem should be
further altered by replacing (a), (b), and (c) by the statement that any m-family
whatsoever lying in $R_1$, joining A to B, and satisfying (1) can be deformed
within R and subject to (1), into a single curve joining A to B.

A proof of these last facts would lead us too far astray. The writer hopes
to return to a discussion of critical points of rank less than n in a separate
paper.

33. An illustrative example. Consider the equator C of a unit sphere. Refer
the sphere near C to coordinates (u, v), giving respectively the longitude and
latitude of the point. Consider the sphere near C as now covered an infinite
number of times by an unending strip S which overhangs the neighborhood
of C after the manner of a Riemann surface. We may suppose S represented
in the (u, v) plane by an unending strip R containing the u axis. We suppose
our integral corresponds to the arc length on S.

The case k=1. If we take a segment g of the u axis whose length $\bar g$ lies
between $\pi$ and $2\pi$ we will have the case k=1 of the preceding theorem. In
the strip R we can clearly take two curves which join g's end points but otherwise
lie on opposite sides of g, and whose lengths are equal but both less than
$\bar g$. These two curves form a 0-family which cannot be deformed into a single
curve without passing out of R or increasing the lengths on S.

The case k=2. Suppose now that the length $\bar g$ of g is between $2\pi$ and $3\pi$.
Here we have the case k=2. Let $u_0$ and $u_3$ correspond to the end points of g,
and let $u_0, u_1, u_2, u_3$ correspond to four successive points on the u axis such
that no one of the resulting three successive intervals is as great as $\pi$ in
length. We consider a one-dimensional closed manifold of pairs $(v_1, v_2)$,
namely the pairs

$v_1(\alpha) = b \cos \alpha, \qquad v_2(\alpha) = b \sin \alpha,$


where $\alpha$ is any real number, and is to be our parameter, and b is a positive
constant. If b be sufficiently small, the following will hold true. On S the
images of the points


$(u_0, 0), \quad [u_1, v_1(\alpha)], \quad [u_2, v_2(\alpha)], \quad (u_3, 0)$




can be successively joined by arcs of great circles which will give the shortest
paths between their end points. The images on R of all such paths for all
values of $\alpha$ will give a closed 1-family. The lengths of the different paths of
the family considered as a function of $\alpha$ will have a maximum M which is
readily seen to be less than $\bar g$. If our choice of b be sufficiently small, it follows
from the general theory that this 1-family cannot be deformed into a single
curve joining g's end points without increasing the lengths beyond M or
passing out of R.

If however we join the end points of g by any two ordinary curves whose
lengths on S are less than a constant $\bar g - e < \bar g$, and which lie in a sufficiently
small neighborhood of g, then, according to the preceding theorem, these
two curves can be deformed into each other without passing out of R, or
using curves whose lengths exceed $\bar g - e$.
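The length claims in the two cases of this example can be spot-checked numerically. The sketch below is illustrative only: it takes u as longitude and v as latitude on the unit sphere, measures each arc by the usual great-circle distance, and uses equally spaced division points together with the sample values b = 0.1, segment length $1.5\pi$ (case k=1) and $2.5\pi$ (case k=2); all of these choices, and the function names, are assumptions made for the illustration. It confirms only that the paths are shorter than the equatorial segment, not the non-deformability statements.

```python
import math


def great_circle(p, q):
    """Great-circle distance on the unit sphere between (longitude, latitude) pairs."""
    (u1, v1), (u2, v2) = p, q
    c = (math.sin(v1) * math.sin(v2)
         + math.cos(v1) * math.cos(v2) * math.cos(u2 - u1))
    return math.acos(max(-1.0, min(1.0, c)))


def broken_path_length(us, vs):
    """Length of the path of great-circle arcs through the points (us[i], vs[i]).

    Consecutive longitudes are assumed to differ by less than pi, so that each
    arc is the shortest path between its end points.
    """
    pts = list(zip(us, vs))
    return sum(great_circle(pts[i], pts[i + 1]) for i in range(len(pts) - 1))


def case_k1(g_len, b):
    """Two paths on opposite sides of the equatorial segment of length g_len."""
    us = [0.0, g_len / 2, g_len]
    upper = broken_path_length(us, [0.0, b, 0.0])
    lower = broken_path_length(us, [0.0, -b, 0.0])
    return upper, lower


def case_k2(g_len, b, steps=360):
    """Lengths of the paths of the 1-family v1 = b cos(alpha), v2 = b sin(alpha)."""
    us = [0.0, g_len / 3, 2 * g_len / 3, g_len]
    lengths = []
    for i in range(steps):
        alpha = 2 * math.pi * i / steps
        vs = [0.0, b * math.cos(alpha), b * math.sin(alpha), 0.0]
        lengths.append(broken_path_length(us, vs))
    return lengths
```

For instance, `max(case_k2(2.5 * math.pi, 0.1))` plays the role of the maximum M of the text and can be compared directly with $2.5\pi$.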

34. Osgood’s Theorem. It has doubtless been noted that the case where 
k=0 has been omitted in the preceding discussion for the reason that 
there are in this case no curves neighboring the given extremal segment 
which satisfy the inequality (1) of the preceding section. For the case k=0 
it seems to the author that the theorem that is due to Osgood* represents 
the nearest approach to the preceding developments. The questions at 
issue are sufficiently similar to have suggested to the author a proof of 
Osgood’s Theorem in the form stated by Hahn (B, p. 281) without, however, 
the hypothesis, used by Hahn and by recent writers on the calculus of variations, 
that at the ends of the extremal segment the problem be regular. Such a proof 
seems desirable because it reduces the hypotheses under which Osgood’s 
Theorem can be proved to a set identical with the least hypotheses under 
which the proper strong minimum is ordinarily established. This proof 
will be published later. 

35. Deformations of periodic extremals. Although the results here are 
different from those of the preceding sections yet the methods may be carried 
over, and afford a proof of the desired results once the problem is well 
defined. 

By an m-family of closed curves $Z_m$ will be understood a set of ordinary
closed curves in the (x, y) plane, the set of all of whose points make up a
single-valued, continuous, point function of the point on a product complex
$C_{m+1}$, obtained by combining an arbitrary point Q on a unit circle with an
arbitrary point P on some m-dimensional manifold $M_m$. The dependence of
the points of $Z_m$ upon the points of $C_{m+1}$ is to be such that each individual
closed curve is obtained by holding P fast and varying Q on the unit circle;


* Osgood, these Transactions, vol. 2 (1901), p. 273. 




and the dependence of the variable point of each individual curve upon 
Q, or more particularly upon the arc length measured from a fixed point on 
the unit circle to the point Q, is to be one by virtue of which the curve may 
be termed ordinary. 

The remaining definitions and results of §§ 28, 29, 30, 31, and 32, may be 
carried over here either unchanged or with the changes obvious at each 
point. We have the following central theorem: 


THEOREM 10. Let there be given an integrand $F(x, y, \dot x, \dot y)$ satisfying the
hypotheses of §1 except that here F is to be of class C''''. Let there be given
in S a non-degenerate (§17) periodic extremal g, at every point (x, y) of which,
and for every $\theta$, $F(x, y, \cos\theta, \sin\theta)$ is positive. Let k be the type number of g,
determined as in Theorem 5, §21. Let $J_0$ be the value of J along g.

Then corresponding to any sufficiently small neighborhood R of g, there
exists within R an arbitrarily small neighborhood $R_1$ of g, and an arbitrarily
small positive constant e with the following properties. Closed m-families
of periodic curves which are deformable within $R_1$ into g, and which give to the
integral J a value such that $J \le J_0 - e^2$ satisfy the statements (a), (b), and (c)
of Theorem 9.


36. Birkhoff's Theorem.* In dealing with deformations of one closed
trajectory into another Birkhoff has given a theorem which for the case of
dynamical systems is essentially equivalent to that special part of the preceding
theorem which has to do with the mutual deformability of two
closed trajectories. Birkhoff does not consider such entities as m-families 
in general. However his pair of closed trajectories comes under the head 
of what the author has called closed 0-families. Birkhoff’s theorem is stated 
in terms of the Poincaré rotation number, and in terms of that rotation 
number tells when a pair of closed trajectories can be deformed into each 
other. The author’s theorem shows that periodic extremals for which k>1 
cannot be distinguished in type simply by a consideration of the mutual de- 
formability of two closed trajectories. It should be stated in explanation that 
Birkhoff was not seeking to distinguish between all types of periodic ex- 
tremals by means of deformations, but rather to determine to what types 
his “minimax principle” applied, and for that purpose his theorem was suffi- 
cient. 

Speaking generally, the results of Part VI show that the number of 
conjugate points on extremals joining two fixed points, or the type number of 
periodic extremals, is the least integer m for which some (m—1)-family of curves 


* Birkhoff, loc. cit., p. 249. 




cannot be deformed into a single curve subject to the conditions described in the
text. This result suggests another new and powerful way of approaching
a theory "in the large."


Part VII. VARIABLE END POINTS 


37. A geometric theorem. No general development for this case will
be presented. However, a suggestive and intrinsically complete geometric
theorem has been obtained in a problem that comes under this head, namely
the problem of the number and nature of the normals which can be drawn
from a fixed point P in a euclidean (n+1)-space $E_{n+1}$ to an n-spread $S_n$ in
that space. Proofs of the following results will be published later.

The n-spread $S_n$ is to be a manifold without boundary in the sense of
Veblen (loc. cit.). Moreover we assume that the points on $S_n$ in the neighborhood
of any given point of $S_n$ are representable by an equation which gives
one of the rectangular coordinates as a function of class C'' of the others.
A preliminary result is that the loci of centers of principal normal curvature*
$C_n$ corresponding to $S_n$ are not space filling in $E_{n+1}$, that is, there are in the
neighborhood of each point of $E_{n+1}$ points not on $C_n$. Let P be any point not
on $C_n$ or $S_n$. A second preliminary result is that any straight line L through
P will be normal to $S_n$ at no more than a finite number of points $Q_i$
$(i = 1, 2, \cdots, m)$. Each of the m segments $PQ_i$ of L will be counted as a different
normal from P to $S_n$. A normal $PQ_i$ will be said to be of type k if there are k
centers of principal normal curvature corresponding to $Q_i$, on L, between P
and $Q_i$. We have the following theorem.


THEOREM. Let P be any point not on the manifold $S_n$, nor on the loci of
centers of principal normal curvature of $S_n$, and let $M_k$ be the number of different
normals from P to $S_n$ of type k. Then between the numbers $M_k$ and the connectivities
$R_i$ of $S_n$ the following relations hold true:


$M_0 \ge 1 + (R_0 - 1),$
$M_0 - M_1 \le 1 + (R_0 - 1) - (R_1 - 1),$
$M_0 - M_1 + M_2 \ge 1 + (R_0 - 1) - (R_1 - 1) + (R_2 - 1),$
$\cdots$
$M_0 - M_1 + \cdots + (-1)^n M_n = 1 + (R_0 - 1) - (R_1 - 1) + \cdots + (-1)^n (R_n - 1),$


where the inequality signs alternate until the final equality is reached.
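The relations of the theorem are easy to tabulate mechanically. The following sketch (with illustrative function names that are not from the paper) checks a proposed list of counts $M_0, \cdots, M_n$ against a list of connectivities $R_0, \cdots, R_n$, and also computes the lower bounds $M_0 \ge R_0$, $M_k \ge R_k - 1$ (k > 0) obtained by combining each relation with the one before it. The connectivities used in the usage note are those of a sphere, a torus, and the manifold obtained by identifying opposite faces of a cube.

```python
def alternating_sums(values):
    """Partial alternating sums v0, v0 - v1, v0 - v1 + v2, ..."""
    sums, total = [], 0
    for i, v in enumerate(values):
        total += (-1) ** i * v
        sums.append(total)
    return sums


def right_hand_terms(R):
    """Terms whose alternating sums are 1 + (R0 - 1) - (R1 - 1) + ...

    The leading 1 is absorbed into the first term, which becomes R0.
    """
    return [R[0]] + [r - 1 for r in R[1:]]


def satisfies_relations(M, R):
    """Check the alternating inequalities and the final equality of the theorem."""
    lhs = alternating_sums(M)
    rhs = alternating_sums(right_hand_terms(R))
    last = len(R) - 1
    for i, (left, right) in enumerate(zip(lhs, rhs)):
        if i == last:
            ok = left == right
        elif i % 2 == 0:
            ok = left >= right
        else:
            ok = left <= right
        if not ok:
            return False
    return True


def lower_bounds(R):
    """Least values of each M_k permitted by two consecutive relations."""
    return right_hand_terms(R)
```

For example, `lower_bounds([1, 3, 2])` gives `[1, 2, 1]` for the torus, whose sum is four, and `lower_bounds([1, 4, 4, 2])` gives `[1, 3, 3, 1]`, whose sum is eight.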


* Eisenhart, Riemannian Geometry, p. 153. 




To this theorem* we add the result that points Q on $S_n$ such that the
straight line segment PQ gives a relative minimum or maximum to the
distance from P to $S_n$, yield normals PQ, respectively of types 0 or n. We
now make the following specific applications of the theorem.

In 3-space let $S_2$ be homeomorphic to a sphere. Here


$R_0 = 1, \quad R_1 = 1, \quad R_2 = 2.$


If there are r normals from P to $S_2$ which give a relative minimum to the
distance from P to $S_2$, and s normals that give a relative maximum, there
are r+s-2 other normals of type 1. The total number of normals is always
even.

In 3-space let $S_2$ be homeomorphic to a torus. Here


$R_0 = 1, \quad R_1 = 3, \quad R_2 = 2,$
$M_0 \ge 1, \quad M_1 \ge 2, \quad M_2 \ge 1.$


Thus there are always four normals at least. The total number of normals
is always even.

In 4-space let $S_3$ be homeomorphic to a manifold obtained by identifying
the opposite faces of an ordinary cube. Here


$R_0 = 1, \quad R_1 = 4, \quad R_2 = 4, \quad R_3 = 2,$
$M_0 \ge 1, \quad M_1 \ge 3, \quad M_2 \ge 3, \quad M_3 \ge 1.$


There are thus at least eight normals, while the total number of normals is
always even.


* A theorem essentially the same as the above has been proved to hold in the general problem
with one variable end point in m dimensions.


HARVARD UNIVERSITY,
CAMBRIDGE, MASS.


TOPOLOGICAL INVARIANTS OF KNOTS AND LINKS* 


BY 
J. W. ALEXANDER 


1. Introduction. The problem of finding sufficient invariants to 
determine completely the knot type of an arbitrary simple, closed curve 
in 3-space appears to be a very difficult one and is, at all events, not solved 
in this paper. However, we do succeed in deriving several new invariants 
by means of which it is possible, in many cases, to distinguish one type of 
knot from another. There exists one invariant, in particular, which is quite
simple and effective. It takes the form of a polynomial $\Delta(x)$ with integer
coefficients, where both the degree of the polynomial and the values of its
coefficients are functions of the curve with which it is associated. Thus, for
example, the invariant $\Delta(x)$ of an unknotted curve is 1, of a trefoil knot
$1 - x + x^2$, and so on. At the end of the paper, we have tabulated the various
determinations of the invariant $\Delta(x)$ for the 84 knots of nine or less crossings
listed as distinct in the tables of Tait and Kirkman. It turns out that with
this one invariant we are able to distinguish between all the tabulated 
knots of eight or less crossings, of which there are 35. Repetitions of the 
same polynomial begin to appear when we come to knots of nine crossings. 

The invariants found in this paper are all intimately related to the so- 
called knot group, as defined by Dehn. This is, of course, what one would 
expect; for many, if not all, of the topological properties of a knot are reflected 
in its group. The knot group would undoubtedly be an extremely powerful 
invariant if it could only be analyzed effectively; unfortunately, the problem 
of determining when two such groups are isomorphic appears to involve 
most of the difficulties of the knot problem itself. 

In §11, we indicate, very briefly, how the results obtained for knots may 
be generalized to systems of knots, or links. We also establish the connection 
between the new invariants derived below and the invariants of the n-sheeted 
Riemann 3-spreads (generalized Riemann surfaces), associated with a knot. 

2. Knots and their diagrams. In order to avoid certain troublesome com- 
plications of a point-theoretical order we shall always think of a knot as 
a simple, closed, sensed polygon in 3-space. A knot will, thus, be composed 
of a finite number of vertices and sensed edges. We shall allow ourselves to 
operate on a knot in the following three ways: 


* Presented to the Society, May 7, 1927; received by the editors, October 13, 1927. 


276 J. W. ALEXANDER [April 


(i) To subdivide an edge into two sub-edges by creating a new vertex 
at a point of the edge. 

(ii) To reverse the last operation: that is to say, to amalgamate a pair 
of consecutive collinear edges, along with their common vertex, into a single 
edge. 

(iii) To change the shape of the knot by continuously displacing a 
vertex (along with the two edges meeting at the vertex) in such a manner 
that the knot never acquires a singularity during the process. It would, of 
course, be easy to express this third operation in purely combinatorial terms. 

Two knots will be said to be of the same type if, and only if, one of them is
transformable into the other by a finite succession of operations of the three 
kinds just described. A knot will be said to be unknotted if, and only if, it is 
of the same type as a sensed triangle. 

To make our descriptions a trifle more vivid we shall often allow ourselves 
considerable freedom of expression, with the tacit understanding that, 
at bottom, we are really looking at the problem from the combinatorial point 
of view. Thus, we shall sometimes talk of a knot as though it were a smooth 
elastic thread subject to actual physical deformations. There will, however, 
never be any real difficulty about translating any statement that we make 
into the less expressive language of pure, combinatorial analysis situs. In 
the figures, we shall picture a knot by a smooth curve rather than by a poly- 
gon. A purist may think of the curve as a polygon consisting of so many 
tiny sides that it gives an impression of smoothness to the eye. 

A knot will be represented schematically 
by a 2-dimensional figure, or diagram. In 
the plane of the diagram a curve, called the 
curve of the diagram, will be traced picturing 
the knot as viewed from a point of space 
sufficiently removed so that the entire knot 
comes, at one time, within the field of vision. 
The curve of the diagram will ordinarily 
have singularities, but we shall assume that 
the point of observation is in a general posi- 
tion so that the singularities are all of the 
simplest possible sort: that is to say, double 
points with distinct tangents. The singular- 
ities of the curve of the diagram will be called 
crossing points, and the regions into which it subdivides the plane regions 
of the diagram. At each crossing point, two of the four corners will be dotted 
to indicate which of the two branches through the crossing point is to be 


Fig. 1


1928] TOPOLOGICAL INVARIANTS OF KNOTS 277 


thought of as the one passing under, or behind the other. The convention 
will be to place the dots in such a manner that an insect crawling in the 
positive sense along the “lower” branch through a crossing point would 
always have the two dotted corners on its left. Two corners will be said to be 
of like signatures if they are either both dotted or both undotted; they will 
be said to be of unlike signatures if one is dotted, the other not. Figure 1 
represents a diagram of one of the two so-called trefoil knots. 

To each region of a diagram a certain integer, called the index of the 
region, will be assigned. We shall allow ourselves to choose the index of any 
one region at random, but shall then fix the indices of all the remaining 
regions by imposing the requirement that whenever we cross the curve from
right to left (with reference to our imaginary insect crawling along the curve
in the positive sense) we must pass from a region of index p, let us say,
to a region of next higher index p+1. Evidently, this condition determines
the indices of all the remaining regions fully and without contradiction.
To save words, we shall say that a corner of a region of index p is itself of
index p.


(a) (b) (c)
Fig. 2


It is easy to verify that at any crossing point c there are always two
opposite corners of the same index p and two opposite corners of indices p-1
and p+1 respectively. The index p associated with the first pair of corners
will be referred to as the index of the crossing point c. Two kinds of crossing
points are to be distinguished according to which branch through the point
passes under, or behind, the other. A crossing point of the first kind, Fig. 2a,
will be said to be right handed, one of the second kind, Fig. 2b, left handed.
At either kind of point the two undotted corners are of indices p-1 and p
respectively, the two dotted ones of indices p and p+1. However, at a right




handed point the dotted corner of index p precedes the dotted corner of
index p+1 as we circle around the point in the counterclockwise sense,
whereas at a left handed point it follows the other. At a crossing point c,
the two corners of like index p may belong to the same region of the diagram.
We observe for future reference that on the boundary of a region of index p
only crossing points of indices p-1, p, and p+1 may appear. Finally, we
recall again that the entire system of indices is determined to within an
additive constant only, since the index of some one region or crossing point
has to be assigned before the indexing of the figure as a whole becomes
determinate.

3. The equations of a diagram. In reality, the same diagram represents 
an infinite number of different knots, but this indetermination is, if anything, 
an advantage, as the knots so represented are all of the same type. The knot 
problem is the problem of recognizing when two different diagrams represent 
knots of the same type. Now, to tell the type of knot determined by a dia- 
gram it is evidently not necessary to know the exact shapes of the various 
elements of the diagram, but only the relations of incidence between the ele- 
ments and the signatures at the corners of the regions. Because of this 
fact, the essential features of a diagram may all be displayed schematically 
by a properly chosen system of linear equations, as we shall now prove. 

If a diagram has $\nu$ crossing points


(3.1) $c_i \quad (i = 1, 2, \cdots, \nu),$


we find, by a simple application of Euler's theorem on polyhedra, that it must
have $\nu + 2$ regions


(3.2) $r_j \quad (j = 0, 1, \cdots, \nu + 1).$


Now, suppose the four corners at a crossing point $c_i$ belong respectively to
the regions $r_j$, $r_k$, $r_l$, and $r_m$, that we pass through these regions in the cyclical
order just named as we go around the point $c_i$ in the counterclockwise sense,
and that the two dotted corners are the ones belonging to the regions $r_j$
and $r_k$ respectively. Then, corresponding to the crossing point $c_i$ we shall
write the following linear equation:


(3.3) $c_i(r) = x r_j - x r_k + r_l - r_m = 0.$


The $\nu$ equations (3.3) determined by the $\nu$ crossing points $c_i$ will be called the
equations of the diagram. The cyclical order of the terms in the left hand
members of these equations plays an essential role and is not to be disturbed.
The distribution of the coefficients x determines in which corners of the
diagram the dots are located.




By way of illustration we shall write out the equations of the diagram 
of the trefoil knot (Fig. 1). They are as follows: 


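$$
\begin{aligned}
c_1(r) &= x r_2 - x r_0 + r_3 - r_4 = 0,\\
c_2(r) &= x r_3 - x r_0 + r_1 - r_4 = 0,\\
c_3(r) &= x r_1 - x r_0 + r_2 - r_4 = 0.
\end{aligned}
\tag{3.4}
$$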


The equations of a diagram determine the structure of the diagram com- 
pletely unless there happen to be two or more edges incident to the same 
pair of regions. For, barring this exceptional case, two cyclically consecutive 
terms in any equation correspond to a pair of regions that are incident 
along one edge only, and, therefore, determine the edge itself. In other words, 
the equations of the diagram tell us the incidence relations between the edges 
and crossing points. But they also tell us the relative position of the four 
edges at a crossing point; therefore, we have all the information needed to 
reconstruct the curve of the diagram. Moreover, the distribution of the 
coefficients x tells us how the corners must be dotted. 

In the exceptional case, where the boundaries of two regions have more
than one edge in common, we are either dealing with the diagram of a
composite knot $K$ or with a diagram that admits of obvious simplification.
Suppose the edges $e_1$ and $e_2$ are on the boundary of each of two regions $r_1$
and $r_2$. Then, if we join a point $P_1$ of the edge $e_1$ to a point $P_2$ of the edge
$e_2$ by means of an arc $\alpha$ lying wholly within the region $r_1$, the extremities of
the arc $\alpha$ will subdivide the curve of the diagram into two non-intersecting
arcs $\gamma_1$ and $\gamma_2$ which may be combined respectively with the arc $\alpha$ to form
the two closed curves

$$\alpha + \gamma_1, \qquad \alpha + \gamma_2.$$

Moreover, these last two curves may be regarded as the diagram curves of a
pair of non-interlinking knots $K_1$ and $K_2$ in space. If neither of the knots
$K_1$ nor $K_2$ is unknotted we may regard $K_1$ and $K_2$ as factors of the composite
knot $K$. If one of them, $K_1$, is unknotted, the knot $K$ must evidently be of
the same type as the other one, $K_2$. Hence, in this case, the diagram of the
knot $K$ may be replaced by the simpler diagram of the knot $K_2$.

4. The invariant polynomial $\Delta(x)$. Let us now treat the equations of the
diagram as a set of ordinary linear equations $E$ in which the ordering of the
terms in the various left hand members is immaterial. Then, the matrix of
the coefficients of equations $E$ will be a certain rectangular array $M$ of $\nu$ rows
and $\nu+2$ columns, one row corresponding to each crossing point and one
column to each region of the diagram. We shall presently show that the


280 J. W. ALEXANDER [April 


matrix M has a genuine invariantive significance; for the moment, let us 
merely observe that it has the following property: 


If the matrix $M$ is reduced to a square matrix $M_0$ by striking out two of its
columns corresponding to regions with consecutive indices $p$ and $p+1$, the
determinant of the residual matrix $M_0$ will be independent of the two columns
struck out, to within a factor of the form $\pm x^n$.


To prove the theorem, let us introduce the symbol $R_p$ to denote the
sum of all the columns corresponding to the regions of index $p$ and the
symbol 0 to denote a column made up exclusively of zero elements. Then,
we obviously have the relation

$$\sum_p R_p = 0; \tag{4.1}$$

for in each row of the matrix there are only four non-vanishing elements,
namely $x$, $-x$, $1$, and $-1$, and the sum of these four elements is zero. We
also have the relation

$$\sum_p x^{-p} R_p = 0; \tag{4.2}$$

for if we multiply the elements of each column by a factor $x^{-p}$, where $p$ is
the index of the (region corresponding to) the column, the four non-vanishing
elements in a row of index $q$ become $x^{1-q}$, $-x^{1-q}$, $x^{-q}$, and $-x^{-q}$, in some order,
so that their sum is again zero. By properly combining relations (4.1) and
(4.2) we obtain the relation

$$\sum_p (x^{-p} - 1) R_p = 0 \tag{4.3}$$

in which the term in $R_0$ disappears.
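As an illustration only, relations (4.1), (4.2) and (4.3) can be checked mechanically on the trefoil equations (3.4). The sketch below assumes the coefficient rows $(-x, 0, x, 1, -1)$, $(-x, 1, 0, x, -1)$, $(-x, x, 1, 0, -1)$ for the columns $r_0, \cdots, r_4$ and the region indices 2, 1, 1, 1, 0; the dictionary representation of polynomials and all helper names are hypothetical conveniences, not notation from the text.

```python
# Laurent polynomials in x are stored as {exponent: integer coefficient}.

def add(p, q):
    """Sum of two Laurent polynomials, dropping zero coefficients."""
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + c
        if out[e] == 0:
            del out[e]
    return out

def mul(p, q):
    """Product of two Laurent polynomials."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

X, NEG_X, ONE, NEG_ONE, ZERO = {1: 1}, {1: -1}, {0: 1}, {0: -1}, {}

# Assumed rows c_1, c_2, c_3 of the trefoil diagram; columns r_0, ..., r_4.
M = [
    [NEG_X, ZERO, X, ONE, NEG_ONE],
    [NEG_X, ONE, ZERO, X, NEG_ONE],
    [NEG_X, X, ONE, ZERO, NEG_ONE],
]
INDEX = [2, 1, 1, 1, 0]  # assumed indices of the regions r_0, ..., r_4

def weighted_row_sums(weight):
    """For each row, the sum over columns of weight(index) * entry."""
    sums = []
    for row in M:
        total = {}
        for entry, p in zip(row, INDEX):
            total = add(total, mul(weight(p), entry))
        sums.append(total)
    return sums

rel_41 = weighted_row_sums(lambda p: {0: 1})                 # sum of R_p
rel_42 = weighted_row_sums(lambda p: {-p: 1})                # sum of x^-p R_p
rel_43 = weighted_row_sums(lambda p: add({-p: 1}, {0: -1}))  # sum of (x^-p - 1) R_p
print(rel_41, rel_42, rel_43)
```

Each of the three results is a list of empty dictionaries, that is, a column of zeros, and the column of index 0 receives the weight zero in the third sum.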
Now, let

$$\pm \Delta_{pq}(x) = \pm \Delta_{qp}(x)$$

be the determinant of any one of the matrices $M_{pq}$ obtained by striking out
from the matrix $M$ a pair of columns of indices $p$ and $q$ respectively. Then,
by (4.3), we clearly have

$$(x^{-q} - 1)\Delta_{0p}(x) = \pm (x^{-p} - 1)\Delta_{0q}(x). \tag{4.4}$$

For relation (4.3) tells us that a column of index $p$ multiplied by the factor
$x^{-p}-1$ is expressible as a linear combination of the other columns of the
matrix $M$ of indices different from zero (that is to say, of columns of the
matrix $M_{0p}$), and that in this linear combination the coefficients of the
columns of index $q$ are $-(x^{-q}-1)$. Moreover, since indices are determined
to within an additive constant only, relation (4.4) gives us

$$(x^{q-r} - 1)\Delta_{pq} = \pm (x^{q-p} - 1)\Delta_{rq},$$

$$(x^{r-s} - 1)\Delta_{qr} = \pm (x^{r-q} - 1)\Delta_{rs},$$

whence,

$$\Delta_{rs} = \pm x^{q-r}\,\frac{x^{r-s} - 1}{x^{q-p} - 1}\,\Delta_{pq}. \tag{4.5}$$

But, as a special case of (4.5), we have the relation

$$\Delta_{r(r+1)} = \pm x^{p-r}\,\Delta_{p(p+1)}, \tag{4.6}$$

which proves the theorem.

Let us now divide the determinant $\Delta_{p(p+1)}(x)$ by a factor of the form $\pm x^n$
chosen in such a manner as to make the term of lowest degree in the resulting
expression $\Delta(x)$ a positive constant. Then,


The polynomial $\Delta(x)$ is a knot invariant.


The theorem will be proved in §6 and again in §10, as a corollary to a more 
general theorem. 

Let us actually evaluate the invariant $\Delta(x)$ in a simple, concrete case.
From the equations of the diagram of the trefoil knot, (3.4), we obtain the
matrix

$$
\begin{Vmatrix}
-x & 0 & x & 1 & -1 \\
-x & 1 & 0 & x & -1 \\
-x & x & 1 & 0 & -1
\end{Vmatrix}.
\tag{4.7}
$$

Now, if we assign indices in such a way that the first column of the matrix is
of index 2, the next three columns will be of index 1 and the last column of index 0.
The determinant $\Delta_{01}$ obtained after striking out the last two columns of the
matrix (4.7) will be

$$\Delta_{01}(x) = -x(1 - x + x^2);$$

the determinant obtained after striking out the first two columns,

$$\Delta_{12}(x) = -(1 - x + x^2).$$

The difference between these two expressions is of the sort predicted by
relation (4.6). The invariant $\Delta(x)$ is, of course,

$$\Delta(x) = 1 - x + x^2.$$
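The evaluation just made can be reproduced by a short computation, offered here as a sketch rather than as part of the original argument. It assumes that the trefoil matrix has the rows $(-x, 0, x, 1, -1)$, $(-x, 1, 0, x, -1)$, $(-x, x, 1, 0, -1)$; polynomials are stored as exponent-to-coefficient dictionaries, and the function names are hypothetical.

```python
# Laurent polynomials in x are stored as {exponent: integer coefficient}.

def add(p, q):
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + c
        if out[e] == 0:
            del out[e]
    return out

def mul(p, q):
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def det(m):
    """Determinant by expansion along the first row."""
    if not m:
        return {0: 1}
    total = {}
    for j, entry in enumerate(m[0]):
        if not entry:          # a zero entry contributes nothing
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        sign = {0: 1 if j % 2 == 0 else -1}
        total = add(total, mul(sign, mul(entry, det(minor))))
    return total

def strike(m, cols):
    """Remove the columns whose positions are listed in cols."""
    return [[e for j, e in enumerate(row) if j not in cols] for row in m]

def normalize(p):
    """Divide by +-x^n so that the lowest term is a positive constant."""
    low = min(p)
    sign = 1 if p[low] > 0 else -1
    return {e - low: sign * c for e, c in p.items()}

X, NEG_X, ONE, NEG_ONE, ZERO = {1: 1}, {1: -1}, {0: 1}, {0: -1}, {}
M = [
    [NEG_X, ZERO, X, ONE, NEG_ONE],
    [NEG_X, ONE, ZERO, X, NEG_ONE],
    [NEG_X, X, ONE, ZERO, NEG_ONE],
]

d01 = det(strike(M, {3, 4}))   # last two columns struck out
d12 = det(strike(M, {0, 1}))   # first two columns struck out
print(d01, d12, normalize(d01))
```

Both determinants normalize to the same dictionary `{0: 1, 1: -1, 2: 1}`, which encodes $1 - x + x^2$.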


5. Further new invariants. It will now be necessary to obtain a some- 
what more precise theorem about the matrix M than the one proved in §4. 






Any two columns of the matrix $M$ of consecutive indices $p$ and $p+1$ may be
expressed as linear combinations of the remaining $\nu$ columns, where the
coefficients of the two linear combinations are polynomials in $x$ with integer coefficients.


Here, and elsewhere throughout the discussion, we shall use the term
“polynomial” in the broad sense, so as to allow terms in negative as well as
positive powers of the mark $x$ to be present.

Since indices are determined to within an additive constant only, we may
assume that $p$ is zero in proving the theorem. Now, in relation (4.3) there
is no term in $R_0$, and the coefficient of the term in $R_1$ is $x^{-1}-1$. Let us divide
the coefficients of all the terms in (4.3) by this last expression so as to make
the coefficient of $R_1$ equal to unity. The coefficients of the remaining terms
will then be expressible as polynomials in the broad sense; for if $p$ is positive,
we have

$$\frac{x^{-p} - 1}{x^{-1} - 1} = 1 + x^{-1} + \cdots + x^{-(p-1)},$$

while if $p$ is negative, we have

$$\frac{x^{-p} - 1}{x^{-1} - 1} = -(x + x^2 + \cdots + x^{-p}).$$

Therefore the simplified relation (4.3) tells us that any column of index 1
is expressible as a linear combination with polynomial coefficients of columns
of indices different from zero. But if we start from the relation

$$\sum_p (x^{1-p} - 1) R_p = 0$$

which also follows, at once, from (4.1) and (4.2), we conclude, by a similar
argument, that any column of index 0 is expressible as a linear combination
with polynomial coefficients of columns of indices different from one. The
theorem follows at once.
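The divisibility on which this proof rests, namely that $(x^{-p}-1)/(x^{-1}-1)$ is a polynomial in the broad sense for every non-zero integer $p$, can be confirmed by multiplying back. The following is a minimal sketch under the assumption that polynomials are stored as exponent-to-coefficient dictionaries; `quotient` and `mul` are hypothetical helper names.

```python
def mul(p, q):
    """Product of two Laurent polynomials stored as {exponent: coefficient}."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def quotient(p):
    """(x^-p - 1) / (x^-1 - 1) written out explicitly, for p != 0."""
    if p > 0:
        return {-k: 1 for k in range(p)}         # 1 + x^-1 + ... + x^-(p-1)
    return {k: -1 for k in range(1, -p + 1)}     # -(x + x^2 + ... + x^-p)

DIVISOR = {-1: 1, 0: -1}                         # x^-1 - 1

checked = []
for p in range(-5, 6):
    if p == 0:
        continue
    # Multiplying the claimed quotient by the divisor must give x^-p - 1.
    assert mul(quotient(p), DIVISOR) == {-p: 1, 0: -1}
    checked.append(p)
print(checked)
```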

Two matrices $M_1$ and $M_2$ will be said to be equivalent if it is possible to
transform one of them into the other by means of the ordinary elementary
operations allowed in the theory of matrices with integer coefficients:

(α) Multiplication of a row (column) by $-1$.

(β) Interchange of two rows (columns).

(γ) Addition of one row (column) to another.

(δ) Bordering the matrix with one new row and one new column, where
the element common to the new row and column is 1 and the remaining
elements of the new row and column are 0’s; or the inverse operation of
striking out a row and a column of the type just described.

Two matrices $M_1$ and $M_2$ will be said to be ε-equivalent if it is possible
to transform one of them into the other by means of the operations (α),
(β), (γ), (δ), along with the further operation

(ε) Multiplication or division of a row (column) by $x$.

Two polynomials will be said to be ε-equivalent if they differ, at most, by a
factor of the form $\pm x^n$. We now state the following theorem, which will be
proved in the next section and again in §10.


If two diagrams represent knots of the same type their matrices $M$ are
ε-equivalent.


As a corollary to this theorem it follows that 


If two diagrams represent knots of the same type the elementary factors of
their matrices $M$ are ε-equivalent, barring factors of the form $\pm x^n$ (those
ε-equivalent to unity).


For operations (α), (β), and (γ) leave the elementary factors invariant;
operation (δ) merely introduces or suppresses a unit factor; operation (ε)
merely changes one of the factors by a factor $x$. By an elementary factor
of a matrix $M$ we here mean the highest common factor of all the minors of
the matrix $M$ of any given order $p$ divided (for $p>1$) by the highest common
factor of all minors of order $p-1$. If all the minors of order $p$ vanish, the
corresponding factor is zero.
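The definition of an elementary factor can be made concrete for a matrix whose elements are plain integers, which is a simplifying assumption: the matrices $M$ of the text have polynomial elements, for which a polynomial highest common factor would be needed. The sketch below uses only the standard library; the function names and the two sample matrices are hypothetical.

```python
from itertools import combinations
from math import gcd

def det(m):
    """Integer determinant by expansion along the first row."""
    if not m:
        return 1
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )

def minor_gcd(m, p):
    """Highest common factor of all minors of order p (0 if all vanish)."""
    g = 0
    for rows in combinations(range(len(m)), p):
        for cols in combinations(range(len(m[0])), p):
            g = gcd(g, det([[m[i][j] for j in cols] for i in rows]))
    return g

def elementary_factors(m):
    """Factor of order p = hcf of order-p minors / hcf of order-(p-1) minors."""
    factors, previous = [], 1
    for p in range(1, min(len(m), len(m[0])) + 1):
        d = minor_gcd(m, p)
        if d == 0:
            factors.append(0)      # all minors of order p vanish
        else:
            factors.append(d // previous)
            previous = d
    return factors

print(elementary_factors([[2, 4], [6, 8]]))
print(elementary_factors([[1, 2], [2, 4]]))
```

For the first sample the minors of order 1 have highest common factor 2 and the single minor of order 2 is $-8$, so the factors are 2 and 4; in the second sample the minor of order 2 vanishes.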

The theorem about the invariance of the polynomial $\Delta(x)$ announced in
§4 is an obvious consequence of the corollary, for $\Delta(x)$ is ε-equivalent to the
product of the elementary factors of the matrix $M$.

The theorem suggests the problem of finding normal forms for the
matrices $M$ under operations (α), (β), (γ), (δ), and (ε). Under this
particular group of operations, the elementary factors of a matrix $M$ are not a
complete set of invariants. They would be if we replaced operation (α)
by the more general operation

(α′) Multiplication of a row (column) by an arbitrary rational number.

The matrix $M_0$ obtained by deleting from the matrix $M$ two columns of
consecutive indices $p$ and $p+1$ is ε-equivalent to the matrix $M$.

This follows, immediately, from the first theorem proved in this section. 

It should be remarked that the matrix N obtained by changing the signs 
of all negative elements of the matrix M is equivalent to the matrix M. For 
if we change the signs of all the elements of M belonging to the columns of 
odd indices we obviously obtain a matrix of such a form that the elements in 
any given row are of like sign. Therefore, by further changing the signs of 
the elements in the rows containing no positive elements we obtain the 
matrix N. Thus, the matrix M is transformable into the matrix N by
elementary transformations. For theoretical purposes, the matrix M offers
certain advantages over the matrix N; when actual computations are to be
made, the matrix N is generally to be preferred, as mistakes in sign are
less likely to be made when it is used.
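The two sign changes that carry $M$ into $N$ may be traced on a small example. The sketch below is illustrative only: it assumes the trefoil rows $(-x, 0, x, 1, -1)$, $(-x, 1, 0, x, -1)$, $(-x, x, 1, 0, -1)$ with column indices 2, 1, 1, 1, 0, and stores each element as a pair (coefficient, exponent); `to_N` is a hypothetical name.

```python
# Each matrix element is a monomial c * x^e stored as the pair (c, e).
M = [
    [(-1, 1), (0, 0), (1, 1), (1, 0), (-1, 0)],
    [(-1, 1), (1, 0), (0, 0), (1, 1), (-1, 0)],
    [(-1, 1), (1, 1), (1, 0), (0, 0), (-1, 0)],
]
INDEX = [2, 1, 1, 1, 0]  # assumed indices of the columns

def to_N(m, index):
    """Apply the two sign changes described in the text."""
    # Step 1: change the signs of the columns of odd index.
    step1 = [
        [(-c, e) if p % 2 else (c, e) for (c, e), p in zip(row, index)]
        for row in m
    ]
    # Step 2: change the signs of the rows containing no positive element.
    result = []
    for row in step1:
        if not any(c > 0 for c, _ in row):
            row = [(-c, e) for c, e in row]
        result.append(row)
    return result

N = to_N(M, INDEX)
print(N)
```

In this example every row becomes wholly negative after the first step, so each row is reversed by the second, and the result agrees with the matrix obtained from $M$ by replacing every element by its absolute value.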

6. Diagram transformations. When a knot is deformed, the equations 
of its diagram remain invariant so long as the topological structure of the 
diagram does not change. Now, a change in the structure of the diagram may 
come about in one or another of the following ways: 

(A) The curve of the diagram may acquire a loop and crossing point 
(Fig. 3) or it may lose a loop and crossing point by a deformation of the 
inverse sort. 


Fig. 3


(B) One branch of the curve may pass under another with the creation
of two new crossing points (Fig. 4); or by a deformation of the inverse
sort, one branch may slide out from under another with the loss of two
crossing points. In this case it must be borne in mind that the corners at the
two crossing points must be so dotted as to imply that the lower branch at
one is also the lower branch at the other.


Fig. 4


(C) If there is a three-cornered region in the diagram, bounded by 
three arcs and three crossing points, and if the branch corresponding to 
one of the three arcs passes beneath the branches corresponding to the other 
two, then any one of the three branches may be deformed past the crossing 
point formed by the intersection of the other two (Fig. 5). The effect is 
the same, topologically speaking, whichever of the three branches undergoes 
the deformation. 


Fig. 5


It is a simple matter to verify that any allowable variation in the structure 
of the diagram may be compounded out of variations of the three simple 
types indicated above.* 

With these facts before us, it is now easy to prove the theorem about the
ε-equivalence of the matrices of two diagrams which determine knots of
the same type. For it is sufficient to show that under each of the
transformations (A), (B), and (C) the matrix of the diagram is carried into an
ε-equivalent one.

First, consider case (A), where a branch of the curve acquires a new loop
and crossing point. Let $M_0$ be the matrix of the original diagram after
the two redundant columns corresponding to the regions $r_1$ and $r_2$ (Fig. 3)
have been struck out. Then, the effect of the transformation is merely to
border the matrix $M_0$ with a new row and column in which all the elements
are zero except the one which the row and column have in common. This
last element will be $\pm 1$ or $\pm x$ according to how the corners at the new
crossing point are dotted. Evidently, the new matrix is ε-equivalent to the
original one.


* Cf., for example, Alexander and Briggs, On types of knotted curves, Annals of Mathematics, 
(2), vol. 28 (1927), pp. 563-586. 




Under case (B), let $M_0$ be the matrix of the original diagram after the
two redundant columns corresponding to the regions $r_1$ and $r_2$ (Fig. 4) have
been struck out, and let $M_0'$ be the corresponding transformed matrix.
Then, if we add column rj of Mg to column rj we obtain the original matrix 
$M_0$ bordered by two new rows and columns in the manner indicated
schematically by the following figure:


1 1/0 
0 
0 


Thus, clearly, the new matrix is again ε-equivalent to the old. It may happen
that the corners are not dotted in the manner indicated in the figure, but 
that the two corners of the region rg are dotted ones. The method of proof is, 
however, essentially the same in this case as in the case just considered. 

In disposing of case (C), we shall replace the matrix $M$ of the original
diagram by the equivalent matrix $N$, as defined at the end of §5, so as not
to be troubled about the correct evaluation of the signs of the various
elements. Moreover, we shall change the rôles of the rows and columns of the
matrix $N$ and think of this last as the matrix of a set of equations in the
symbols $c_i$ associated with the crossing points, rather than in the symbols $r_i$
associated with the regions. Then, with the aid of Fig. 5, we may verify,
at once, that the matrix $N$ undergoes the transformation induced by the
following change of symbols:


= — 


1 
(6.1) 2 = x01 + 6s, 
3 


Cc C1 + Ce. 


In making the verification we must not overlook the relations 


+ Co = 0 
and 
ci + + =0 
corresponding to the regions rp and rg (Fig. 5) respectively. Now, the sub- 
stitution (6.1) is, clearly, the product of a substitution 


= 


which induces an ε-operation on the matrix N, and a substitution of
determinant unity which induces a set of elementary operations, by a well
known theorem. Therefore the matrix N is carried over into an ε-equivalent
one. The cases where the corners are dotted in a different manner from that
shown in Fig. 5 are treated in a similar manner.

This completes the proof of the invariantive character of the matrix M,
whence, also, it follows that the polynomial $\Delta(x)$ is an invariant.

In the next sections we shall establish the connection between the matrix 
M and the so-called group of the knot, as defined by Dehn. This will necessi- 
tate the interpolation of a few preliminary remarks bearing, very largely, 
on questions of terminology and notation. 

7. Abstract groups. In expressing the resultant of two or more operations
of a group A we shall use the summation, rather than the product notation.
Thus, if the symbols $a_i, a_j, \cdots$ represent operations of the group, the
symbol $-a_i$ will represent the inverse of the operation $a_i$, the symbol $a_i+a_j$
the resultant of the operation $a_i$ followed by the operation $a_j$, the symbol 0
the identical operation. Furthermore, the symbol $\lambda a_i$, where $\lambda$ is any positive
integer, will denote the resultant of the $\lambda$-fold repetition of the operation $a_i$,
and the symbol $-\lambda a_i$ the resultant of the $\lambda$-fold repetition of the operation
$-a_i$. It goes without saying that we must distinguish, in general, between
the operations $a_i+a_j$ and $a_j+a_i$.

If two consecutive terms of a sum of operations 


involve the same letter $a_i$, they may be contracted into a single term in $a_i$.
After all possible contractions of the sort have been made the sum $c(a_i)$
will be said to be in its reduced form. We shall use the identity sign between
two sums,

$$c(a_i) \equiv d(a_i),$$

to indicate that the sums, when reduced, are formally identical. An equality
sign between two symbols

$$c = d$$

will merely indicate that the two symbols represent the same operation of
the group, without implying their formal identity.
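The contraction of consecutive terms into a reduced form can be sketched as follows. This is an illustrative model, not the author's notation: a sum is assumed to be stored as a list of (coefficient, letter) pairs read from left to right, and `reduce_sum` and `formally_identical` are hypothetical names.

```python
def reduce_sum(terms):
    """Contract consecutive terms in the same letter until none remain.

    terms is a list of (coefficient, letter) pairs in the order in which
    the operations are performed; the order is never disturbed.
    """
    out = []
    for coeff, letter in terms:
        if coeff == 0:
            continue
        if out and out[-1][1] == letter:
            total = out[-1][0] + coeff
            out.pop()
            if total != 0:
                out.append((total, letter))
        else:
            out.append((coeff, letter))
    return out

def formally_identical(c, d):
    """The identity sign: the two sums agree once both are reduced."""
    return reduce_sum(c) == reduce_sum(d)

# a + b - b + a contracts to 2a, since the terms in b cancel first.
print(reduce_sum([(1, "a"), (1, "b"), (-1, "b"), (1, "a")]))
# a + b and b + a are distinct sums in a non-commutative group.
print(formally_identical([(1, "a"), (1, "b")], [(1, "b"), (1, "a")]))
```

Because the partial result is kept as a stack, a cancellation that exposes two further terms in the same letter is contracted in the same pass.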
Let

$$a_1, a_2, \cdots, a_m \tag{7.1}$$

be a set of generators of a group A: that is to say, a set of operations of the
group A in terms of which all the operations of A may be expressed. Then,
in most cases, there will exist certain identical combinations of the generators
of the form

$$c_i(a_j) = 0 \qquad (i = 1, 2, \cdots, n). \tag{7.2}$$


Now, if we know any set of identities (7.2) among the generators there are
three standard processes whereby we may enlarge the set (7.2) by the
formation of new identities:

(i) The process of inversion, giving identities of the form

$$-c_i(a_j) = 0; \tag{7.3}$$

(ii) The process of summation, giving identities of the form

$$c_i(a_j) + c_k(a_j) = 0; \tag{7.4}$$

(iii) The process of transformation, giving identities of the form

$$e(a_j) + c_i(a_j) - e(a_j) = 0, \tag{7.5}$$

where $e(a_j)$ is any operation of the group.

The set (7.2) will be said to be complete if there is no identical combination
of the generators which cannot ultimately be brought into the set by repeated
application of the three processes just indicated. Thus, if the set (7.2)
is complete, the most general identical combination of the generators must
be of the form

$$c = \sum_k \left[e_k \pm c_{i_k} - e_k\right] = 0. \tag{7.6}$$


A group A is fully determined by a set of generators (7.1) together with a 
complete set of identities (7.2) among them. In all of the discussion we 
shall confine our attention to the case when the number of generators (7.1) 
and defining identities (7.2) is finite. 

With every group A there is associated a commutative group $A_c$
determined by adjoining to the defining identities (7.2) of A all possible
relations among the generators of the form

$$a_i + a_j = a_j + a_i.$$

The group $A_c$ may also be thought of as the one determined by the generators
(7.1) and identities (7.2) alone, where, however, we must now assign a new
meaning to our symbolism and regard addition as commutative. If we do
this, equations (7.2) simplify, by collecting terms in like symbols, to the form

$$c_i = \sum_j \epsilon_{ij} a_j = 0 \tag{7.7}$$

while relation (7.6) which displays the form of the most general identical
combination among the generators becomes

$$c = \sum_i \lambda_i c_i = 0. \tag{7.8}$$
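Collecting terms in like symbols, the step that leads from (7.2) to (7.7), amounts to adding up the coefficients of each generator without regard to order. A minimal sketch follows, assuming that an identity is stored as a list of (coefficient, generator) pairs; `commutative_row` and the sample generators are hypothetical.

```python
def commutative_row(terms, generators):
    """Coefficients of one identity once addition is regarded as commutative.

    The returned list is one row of the coefficient matrix of (7.7),
    with the columns in the order given by generators.
    """
    coeff = {g: 0 for g in generators}
    for c, letter in terms:
        coeff[letter] += c
    return [coeff[g] for g in generators]

GENERATORS = ["a1", "a2"]

# a1 + a2 - a1 - a2 carries no information in the commutative group.
print(commutative_row([(1, "a1"), (1, "a2"), (-1, "a1"), (-1, "a2")], GENERATORS))
# 2 a1 + a2 + a1 collects to 3 a1 + a2.
print(commutative_row([(2, "a1"), (1, "a2"), (1, "a1")], GENERATORS))
```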


The group $A_c$ is, obviously, an invariant of the group A, whence, also,
its own invariants are invariants of A. For future reference, we quote
without proof a classical theorem about the commutative group $A_c$. Let
$\|\epsilon_{ij}\|$ denote the matrix of the coefficients in equation (7.7). Then


The elementary factors of the matrix $\|\epsilon_{ij}\|$ which differ from unity
form a complete set of invariants of the group $A_c$.
Therefore, also, they are invariants of the group A.


8. The knot group. The group R of a knot, as defined by Dehn,* is merely 
the ordinary topological group of the space S exterior to the knot. Let us fix 
upon some point of the space S, such as the observation point P from which 
we are supposed to be viewing the knot when we look at its diagram. Then, in 
the space S, each closed, sensed curve beginning and ending at P determines 
an operation of the group R. Moreover, two different sensed curves deter- 
mine the same operation if, and only if, one may be deformed continuously 
into the other within the space S, while its ends remain fixed at the point P. 
During the deformation the curve may cut through itself at will, but it must 
never come into contact with the knot, as that would involve its leaving the 
space S. The condition that a curve determine the identical operation is 
that it be continuously deformable into the point P itself. If two sensed
curves $C_1$ and $C_2$ correspond respectively to the operations $r_1$ and $r_2$ of the
group R, the sensed curve $C_1+C_2$ obtained by joining the initial point of $C_2$
to the terminal point of $C_1$ determines the operation $r_1+r_2$.

The group R of a knot may be obtained, at once, from a diagram of the
knot.† Let us flatten out the knot until it coincides, sensibly, with the curve
of its diagram. Then, if we pick out, at random, a region $r_0$ of the diagram,
there will be one generator of the group corresponding to each of the other
$\nu+1$ regions $r_i$, where the generator in question is the operation determined
by a curve which starts from the point P, crosses the region $r_i$, passes behind
the plane of the diagram, and returns to the point P by way of the region $r_0$.
It is easy to see, by inspection, that to each crossing point of the diagram
there corresponds an identical relation of the form

$$r_j - r_k + r_l - r_m = 0, \tag{8.1}$$

where if the symbol $r_0$ appears in this relation it must be set equal to zero.
Moreover, it is not difficult to verify‡ that the set of relations (8.1) is
complete. We, therefore, have the following theorem:


* M. Dehn, Topologie des dreidimensionalen Raumes, Mathematische Annalen, vol. 69 (1910),
pp. 137-168.
† Dehn, loc. cit.


The group R of a knot is the one determined by the equations of the diagram, 
§3, when we set x equal to unity, together with one more equation of the form 


$$r_0 = 0.$$


9. Indexed groups. We shall now make another short digression leading 
to a generalization of the theorem quoted at the end of §7. A group A will be 
said to be indexed if with each operation of the group there is associated an 
integer, called the index of the operation, such that 

(i) The index of the identical operation is zero; 

(ii) There exists an operation of index unity; 

(iii) The index of the resultant of two operations is the sum of the 
indices of the two operations. 

Two indexed groups will be said to be directly equivalent if they are
related by a simple isomorphism pairing elements of like indices, and
inversely equivalent if they are related by a simple isomorphism pairing
elements of index $p$ ($p = 0, \pm 1, \pm 2, \cdots$) with elements of index $-p$.

Let A be an indexed group determined by a finite number of generators
connected by a finite number of identical relations. Then, clearly, the
generators may always be chosen in the canonical form

$$s, a_1, a_2, \cdots, a_m, \tag{9.1}$$

where the first generator $s$ is of index 1 and the others $a_i$ are of index 0. For
any arbitrary finite set of generators may be reduced to the above form by a
process analogous to the one used in finding the highest common factor of a
set of integers. The defining identities of the group, expressed in terms of the
generators (9.1), will be certain linear expressions which we shall denote by

$$c_i(s, a_1, a_2, \cdots, a_m) = 0. \tag{9.2}$$
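The reduction to canonical form, described above as analogous to finding a highest common factor, can be sketched on the indices alone. The code below is one possible reading of that remark, not a procedure given in the text: it assumes that replacing a generator $g_i$ by $g_i - q\,g_j$ is allowed and lowers its index by $q$ times the index of $g_j$, in accordance with rule (iii). The function name and the recorded step labels are hypothetical.

```python
def canonical_indices(indices):
    """Euclid-style reduction of generator indices to the pattern 1, 0, ..., 0.

    Returns the final indices together with the list of steps taken.
    A step ("subtract", i, j, q) stands for replacing g_i by g_i - q*g_j.
    """
    idx = list(indices)
    steps = []
    while True:
        nonzero = [i for i, v in enumerate(idx) if v != 0]
        if len(nonzero) <= 1:
            break
        pivot = min(nonzero, key=lambda i: abs(idx[i]))
        for i in nonzero:
            if i != pivot:
                q = idx[i] // idx[pivot]
                idx[i] -= q * idx[pivot]
                steps.append(("subtract", i, pivot, q))
    if not nonzero or abs(idx[nonzero[0]]) != 1:
        # Condition (ii) of an indexed group fails for these indices.
        raise ValueError("no operation of index unity can be generated")
    k = nonzero[0]
    if idx[k] == -1:
        idx[k] = 1
        steps.append(("invert", k))      # replace g_k by its inverse
    if k != 0:
        idx[0], idx[k] = idx[k], idx[0]
        steps.append(("swap", 0, k))     # put the index-1 generator first
    return idx, steps

final, steps = canonical_indices([4, 6, 9])
print(final, steps)
```

For generators of indices 4, 6, 9 the first pass leaves 4, 2, 1 and the second leaves 0, 0, 1, after which the generator of index 1 is moved to the front.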


Now, it will be observed that the operations of the group A which are of
index 0 determine a self-conjugate subgroup A* of A. Let $a$ be any operation
of this subgroup. Then if the operation $a$ is expressed in terms of the
generators (9.1) of the group A,

$$a(s, a_1, a_2, \cdots, a_m), \tag{9.3}$$

the sum of the coefficients of the terms in $s$ must evidently vanish.
Consequently, if we interpolate between every two terms of (9.3) a pair of
redundant terms of the form $-\lambda s+\lambda s$, where the various coefficients $\lambda$
are suitably determined, we shall obtain a representation of the operation $a$
in the form

$$a = \sum_i \left(\lambda_i s \pm a_{p_i} - \lambda_i s\right). \tag{9.4}$$


‡ Dehn, loc. cit.


It will be convenient to introduce the abridged notation

$$\pm x^{\lambda} a_i = \lambda s \pm a_i - \lambda s$$

to denote a succession of three terms like the ones appearing in the sum
(9.4). We shall then be able to express the sum $a$ in the form

$$a = \sum_i \pm x^{\lambda_i} a_{p_i}. \tag{9.5}$$

Conversely, every sum of terms $\pm x^{\lambda} a_i$ represents an operation of index 0:
that is to say, an operation of the subgroup A* of A. In particular, the
defining relations (9.2) of the group A may be written

$$c_i(x, a) = \sum_k \pm x^{\lambda_{ik}} a_{p_{ik}} = 0. \tag{9.6}$$
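The passage from a sum in the generators $s, a_1, \cdots, a_m$ to the abridged notation can be sketched as a single left-to-right pass that keeps track of the accumulated coefficient of $s$. The representation of a sum as a list of (coefficient, generator) pairs and the name `abridge` are assumptions made for illustration.

```python
def abridge(word):
    """Rewrite a sum of index 0 as a list of terms +-x^lambda a_i.

    word is a list of (coefficient, generator) pairs, the generator "s"
    being the one of index 1. Each output triple (sign, lam, gen) stands
    for sign * x^lam * gen, that is, lam*s + sign*gen - lam*s.
    """
    level = 0            # accumulated coefficient of s so far
    out = []
    for coeff, gen in word:
        if gen == "s":
            level += coeff
        else:
            sign = 1 if coeff > 0 else -1
            out.extend([(sign, level, gen)] * abs(coeff))
    if level != 0:
        raise ValueError("the sum is not of index 0")
    return out

# 2s + a1 - s - a2 - s  becomes  x^2 a1 - x a2.
print(abridge([(2, "s"), (1, "a1"), (-1, "s"), (-1, "a2"), (-1, "s")]))
```

The exponent attached to each term is simply the coefficient of $s$ accumulated to its left, which is the value of $\lambda$ required by the interpolated pairs $-\lambda s + \lambda s$.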


Now, let us reexamine the three processes (7.3), (7.4), and (7.5) of §7,
whereby new identities are to be formed from the identities of a given set
(9.6). The first two processes require no particular comment; in the new
notation they may be expressed by

$$-c_i(x, a) = 0 \tag{9.7}$$

and

$$c_i(x, a) + c_k(x, a) = 0 \tag{9.8}$$

respectively. The third process, however, is expressed by

$$e(x, a) + x^{\lambda} c_i(x, a) - e(x, a) = 0, \tag{9.9}$$

where the presence of the coefficient $x^{\lambda}$ before the middle term is to be
accounted for by the fact that in reducing expression (7.5) to the form (9.9)
we must, in general, interpolate a pair of redundant terms $-\lambda s+\lambda s$ after
the term $e(a_j)$ in order to obtain a group of terms

$$e(x, a) = e(a_j) - \lambda s$$

of index zero. Relation (7.6) which exhibits the most general identity among
the generators (9.6) becomes, in the abridged notation,

$$c(x, a) = \sum_k \left[e_k(x, a) \pm x^{\lambda_k} c_{i_k}(x, a) - e_k(x, a)\right] = 0. \tag{9.10}$$


We may now regard the subgroup A* of A as determined by the generators

$$a_1, a_2, \cdots, a_m \tag{9.11}$$

of A of indices zero together with the identities (9.6), which, for convenience,
we shall now rewrite:

$$c_i(x, a) = 0. \tag{9.12}$$

Relation (9.10) shows us how to form the most general identity in the
generators $a_i$.

With the group A* there is associated a commutative group $A_c^*$
determined by adding to the identities (9.12) all relations of the form

$$x^{\lambda} a_i + x^{\mu} a_j = x^{\mu} a_j + x^{\lambda} a_i.$$

The group $A_c^*$ bears the same relation to the group A* as the group $A_c$ of §7
to the group A. It may be thought of as the one determined by the generators
(9.11) and identities (9.12) alone, where, as in §7, we change the meaning
of our notation and regard addition as commutative. Here, however, we
must bear in mind that it is only when we express the operations of the
group $A_c^*$ in abridged notation that addition is commutative. The operations
$a$ and

$$x^{\lambda} a = \lambda s + a - \lambda s$$

are still to be regarded as distinct, otherwise we would get only trivial
results. The defining identities (9.12) of the commutative group $A_c^*$ may
evidently be simplified, by collecting the terms in the various symbols $a_j$,
to the form

$$c_i(x, a) = \sum_j X_{ij} a_j = 0, \tag{9.13}$$

where the coefficients $X_{ij}$ are polynomials in $x$ with integer coefficients.
(Here, again, we are using the term “polynomial” in the broad sense so as
to allow negative as well as positive powers of $x$ to be present.) Relation
(9.10) exhibiting the form of the most general identity in the generators
simplifies, in this case, to

$$c(x, a) = \sum_i X_i c_i(x, a) = 0, \tag{9.14}$$

where the coefficients $X_i$ are also polynomials in the broad sense.
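The polynomial coefficients of (9.13) can be obtained from a defining identity by accumulating, for each generator of index 0, the powers of $x$ with which it occurs. The following sketch assumes that an identity is stored as a list of (coefficient, generator) pairs with `"s"` denoting the generator of index 1, and that polynomials are exponent-to-coefficient dictionaries; `coefficient_row` is a hypothetical name.

```python
def coefficient_row(word, generators):
    """One row of the polynomial coefficient matrix for an identity of index 0.

    A term n*a met when the accumulated coefficient of s equals lam
    contributes n * x^lam to the polynomial attached to the generator a.
    """
    level = 0
    row = {g: {} for g in generators}
    for coeff, gen in word:
        if gen == "s":
            level += coeff
            continue
        poly = row[gen]
        poly[level] = poly.get(level, 0) + coeff
        if poly[level] == 0:
            del poly[level]
    if level != 0:
        raise ValueError("the identity is not of index 0")
    return [row[g] for g in generators]

# s + a1 - s - a2 - a1 = 0 collects to (x - 1) a1 - a2 = 0.
row = coefficient_row(
    [(1, "s"), (1, "a1"), (-1, "s"), (-1, "a2"), (-1, "a1")],
    ["a1", "a2"],
)
print(row)
```

Here the coefficient of `a1` is the dictionary `{1: 1, 0: -1}`, that is $x - 1$, and the coefficient of `a2` is `{0: -1}`, that is $-1$.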

Now, let $\|X_{ij}\|$ be the matrix of the coefficients of equations (9.13).
Then, as a generalization of the theorem quoted at the end of §7, we shall
have the following proposition:


If two indexed groups A and B are directly equivalent, their associated
matrices $\|X_{ij}\|$ and $\|Y_{ij}\|$ are ε-equivalent.


The proof will be made in a series of easy steps. Let us choose a canonical
set of generators

$$s, a_1, a_2, \cdots, a_m \tag{9.15}$$

of the group A, and a set of defining identities

$$c_i(x, a) = 0 \qquad (xa = s + a - s). \tag{9.16}$$

Then, corresponding to these last, we shall have the identities

$$\sum_j X_{ij} a_j = 0 \tag{9.17}$$

of the commutative group $A_c^*$ associated with the group A.

Now, let us observe the following simple facts:

(i) If we enlarge the set (9.16) by the adjunction of one new relation which
is dependent on the ones already in the set, the matrix $\|X_{ij}\|$ is, thereby,
transformed into an ε-equivalent one. For, by relation (9.14), the matrix
$\|X_{ij}\|$ merely acquires a new row, expressible as a linear combination
of the old ones with polynomial coefficients.

(ii) If we adjoin a new generator $a_{m+1}$ of index zero to the set (9.15) and,
at the same time, add to relations (9.16) an identity of the form

$$c_{m+1}(x, a) + a_{m+1} = 0$$

expressing the new generator in terms of the old ones, the matrix $\|X_{ij}\|$
is again transformed into an ε-equivalent one. For the matrix $\|X_{ij}\|$ is
merely bordered by a new row and column, where the elements of the new
column are all zero except the one in the new row which is unity.

(iii) If we replace the generator $s$ by another operation $t$ of index unity
such that the operations

$$t, a_1, a_2, \cdots, a_m \tag{9.18}$$

also generate the group A we leave the matrix $\|X_{ij}\|$ invariant. For suppose
we write

$$y^{\lambda} a = \lambda t + a - \lambda t$$

to denote transformation through this new operation $t$. Then, since $s$ and $t$
are both of index unity, it must be possible to write

$$s = t + \phi(y, a) \tag{9.19}$$

where $\phi(y, a)$ is of index zero and, therefore, expressible in the abridged
notation. Therefore, we have

$$xa = y(\phi + a - \phi). \tag{9.20}$$

But suppose we make the substitution (9.20) in equation (9.17). Since, in
these last equations, addition is to be regarded as commutative, the
substitution (9.20) produces the same effect as the substitution $x=y$. Therefore,
the matrix $\|X_{ij}\|$ is left invariant, except for a change of notation.

With the above facts established, the proof of the theorem is immediate.
Let the generators of the group B, written in canonical form, be

$$t, b_1, b_2, \cdots, b_n, \tag{9.21}$$

and the defining relations,

$$d_i(y, b) = 0. \tag{9.22}$$

Then, since the groups A and B are directly isomorphic, we may express the
isomorphism either by the identities

$$t = s + \phi(x, a) \qquad [yb = x(\phi + b - \phi)], \tag{9.23}$$

$$b_i = \phi_i(x, a), \tag{9.24}$$

or by the inverted identities

$$s = t + \psi(y, b) \qquad [xa = y(\psi + a - \psi)], \tag{9.25}$$

$$a_i = \psi_i(y, b). \tag{9.26}$$

Now, starting with the generators (9.15) and identities (9.16), let us
adjoin successively the generators $b_i$ of the group B along with the
corresponding relations (9.24) expressing these last in terms of the generators of the
group A. Moreover, let us next adjoin successively relations (9.22) and
(9.26), in which we think of $y$ as expressed in terms of

$$s, a_1, \cdots, a_m, b_1, \cdots, b_n$$

by means of relation (9.23). We finally obtain the group A determined by
the generators

$$s, a_1, \cdots, a_m, b_1, \cdots, b_n$$

with the defining identities (9.16), (9.22), (9.24) and (9.26). Moreover,
by (ii) and (i), the matrix $\|X'_{ij}\|$ corresponding to this new mode of
definition is ε-equivalent to the matrix $\|X_{ij}\|$.

By a similar argument, we may define the group B by means of the
generators

$$t, a_1, \cdots, a_m, b_1, \cdots, b_n$$

along with the same defining identities (9.16), (9.22), (9.24), and (9.26), where the
matrix $\|Y'_{ij}\|$ corresponding to the new mode of definition is ε-equivalent
to the matrix $\|Y_{ij}\|$. Finally, by (iii) the matrix $\|X'_{ij}\|$ must be identical
with the matrix $\|Y'_{ij}\|$ except for a change of notation. Therefore, the
matrices $\|X_{ij}\|$ and $\|Y_{ij}\|$ must be ε-equivalent.


If two indexed groups A and B are inversely equivalent, the matrix $\|X_{ij}\|$
associated with the group A goes over into a matrix which is $\varepsilon$-equivalent to the
matrix $\|Y_{ij}\|$ associated with the group B if we make the change of marks


The proof is similar to that of the previous theorem. The one essential 
difference is that in place of relation (9.19) we must write 


s= —t+ (y, a) ; 
whence, in place of (9.20), we have 


The matrices $\|X_{ij}\|$ and $\|X'_{ij}\|$ corresponding to two different ways of
defining an indexed group A are $\varepsilon$-equivalent. Moreover, the effect on the matrix
of changing the signs of the indices of all the operations of an indexed group A
is that produced by the substitution $x' = x^{-1}$.


This is, of course, a consequence of the two previous theorems, when A
and B are regarded as symbols for the same group.

10. Application to knots. The group R of a knot may evidently be
thought of as an indexed group, for with each curve C determining an
operation r of the group there is associated a certain integer measuring the
number of times (in the algebraical sense) that the curve C winds around or
loops the knot. This integer will be defined as the index of the operation r.
With proper conventions as to what shall be the positive sense of winding
around the knot, the index of an operation $r_i$ of the group will evidently be
equal to the index of the region $r_i$ diminished by the index of the region $r_0$, or,
if we choose the additive constant at our disposal so as to make the index
of the last named region equal to zero, the index of the operation $r_i$ will
simply be the index of the region $r_i$.

Now, let us choose our notation so that $r_0$ and $r_{\nu+1}$ are two regions with
consecutive indices 0 and 1. Moreover, let us denote by $p_i$ the index of a
general region $r_i$. Then, if we make the substitution


(10.1)    $r_i = p_i s + r_i'$,


the new set of generators $s, r_i'$




will evidently be in canonical form, for the index of the first one will be unity 
and the indices of the others zero. Let us examine the form that the defining 
relations 


(10.2) 


of the group R take when written in the abridged notation. If an equation 
(10.2) corresponds to a right handed crossing point of index p it may be 
expressed as 


pstri —ri —(p—1)s=0 
in terms of the canonical generators. Therefore, if we put 
$x r' = s + r' - s$
it reduces, after we leave off the primes, to 


A similar reduction leading to the same final result may be made if equations 
(10.2) correspond to a left-handed crossing point. Therefore, 


The equations of the diagram taken in conjunction with two more equations 
of the form 


$r_0 = 0, \qquad r_{\nu+1} = 0,$


corresponding to regions with consecutive indices are the equations of the group 
of the knot written in abridged notation. 


In other words, the matrix $M_0$, §5, obtained by striking out two columns
of consecutive indices from the matrix M is simply the matrix $\|X_{ij}\|$, §9,
of the group equations written in abridged notation. This gives us a second
proof of the $\varepsilon$-invariantive character of the matrix M from which most of
the other theorems in §§4 and 5 are immediately deducible.

11. Links. A link will be defined as a figure composed of the vertices and
sensed edges of a finite number of non-intersecting knots. The most obvious
link invariant is the number of knots into which the link may be resolved.
We shall call this number the multiplicity $\mu$ of the link. A knot will thus be a
link of multiplicity one. Evidently, the entire discussion up to this point
applies not only to knots but to links of arbitrary multiplicities. That is
to say, with every link there will be associated a matrix M having the
same $\varepsilon$-invariantive significance as for the case of a knot, an invariant
polynomial $\Delta(x)$, and so on. In the case of a link of higher multiplicity a
broader generalization is, however, possible, as we shall now indicate.




Let L be a link of multiplicity $\mu$ made up of the elements of $\mu$ different
knots


(11.1)    $K_1, K_2, \ldots, K_\mu$.


Then, at each crossing point $c_i$ of the diagram of the link L the lower branch
will belong to some knot $K_\alpha$ of system (11.1), the upper branch to some knot
$K_\beta$ which may, or may not, be the same as the knot $K_\alpha$. To the crossing point
$c_i$ we shall attach the number $\alpha$ associated with the knot $K_\alpha$ determined by
the lower branch through the point. Moreover, we shall replace the equation
(3.3) of the diagram associated with the crossing point $c_i$ by a similar equation


(11.2)    $x_\alpha r_j - x_\alpha r_k + r_l - r_m = 0$,


where the coefficient x of the original equation has been replaced everywhere
in the equation by the coefficient $x_\alpha$. The matrix $M_\mu$ of the system of equations
(11.2) determined by the various crossing points $c_i$ will thus be an
array in the marks 0, $\pm 1$, and $\pm x_\alpha$ ($\alpha = 1, 2, \ldots, \mu$). It will reduce to the
matrix M, as defined in §4, if we replace all of the marks $x_\alpha$ by one single
mark x.

Two matrices $M_\mu$ will be said to be $\varepsilon$-equivalent if it is possible to transform
one of them into the other by means of a finite number of elementary
operations ($\alpha$), ($\beta$), ($\gamma$), ($\delta$), §5, in combination with a finite number of
operations of the following type:

($\varepsilon$) Multiplication or division of a row (column) by $x_\alpha$.

This last operation is, of course, the natural generalization of the operation
of the same name defined in §4 for the case where we have a simple mark x.

We now have the following broad generalization of one of the theorems 

of §5: 


If two diagrams represent links of the same type their matrices $M_\mu$ are
$\varepsilon$-equivalent.


The theorem may be verified directly by the elementary method of §6. 
It may also be derived by group theoretical considerations analogous to those 
developed in §§9 and 10. We shall indicate, briefly, the second method of 
proof. 

The group A of a link of multiplicity $\mu$ is a $\mu$-tuply indexed group; for with
each operation a of the group there may be associated a composite index


(11.3)    $(p_1, p_2, \ldots, p_\mu)$


such that the number $p_i$ is the linkage number of a curve determining the
operation with the ith component knot $K_i$ of the group. By a process
analogous to the one used in finding the highest common factor of a set of
integers, it is easy to reduce any set of generators of the group A to the
canonical form


(11.4)    $s_1, s_2, \ldots, s_\mu, a_1, \ldots, a_m$,

where the index of each generator $s_i$ of the first type is composed of zeros
except for the number $p_i$ which is one, and where the index of each generator
$a_i$ of the second type is composed exclusively of zeros. The identical relations
in the generators will be certain linear expressions of the form

(11.5)    $C_i(s_1, \ldots, s_\mu, a_1, \ldots, a_m) = 0$.
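
The reduction to canonical form is only sketched in the text. As a rough illustration for the singly indexed case (one integer index per generator), the following Python sketch applies Euclid's algorithm to a list of indices: replacing a generator g_i by g_i - q g_k subtracts q times the index of g_k from the index of g_i. The function name and the sample indices are our own assumptions, and the sketch tracks only the indices, not the group elements themselves.

```python
def reduce_indices(indices):
    """Euclidean reduction of a list of integer indices.

    Each step mimics replacing a generator g_i by g_i - q*g_k, which changes
    the index of g_i by -q times the index of g_k.  The loop stops when at
    most one index is nonzero; that index is then the highest common factor.
    """
    idx = list(indices)
    while sum(1 for p in idx if p != 0) > 1:
        # pivot: position of the nonzero index of smallest absolute value
        k = min((i for i, p in enumerate(idx) if p != 0), key=lambda i: abs(idx[i]))
        for i in range(len(idx)):
            if i != k:
                idx[i] -= (idx[i] // idx[k]) * idx[k]
    # replacing a generator by its inverse changes the sign of its index
    return [abs(p) for p in idx]


# hypothetical indices with highest common factor 1
print(reduce_indices([4, 6, 9]))
```

When the indices have highest common factor 1, as the text requires for a canonical form, the result has a single index equal to one and all others zero.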

We shall denote by A' the group determined by the relations (11.5) together
with all additional relations of the form

(11.6)    $s_i + s_j = s_j + s_i$


expressing that any two generators (11.4) of the first type are commutative.
Moreover, we shall denote by A* the self conjugate subgroup of the group
A' consisting of all operations of the group A' of index $(0, 0, \ldots, 0)$. Let
us now use the abridged notation


(11.7)    $x_\alpha a = s_\alpha + a - s_\alpha$.

Then, by an obvious extension of the argument used in §9, we may show that
the operations of the group A* are precisely the ones which may be represented
by sums of the form


Finally, if we impose a further set of relations making any two operations of
the group A* commutative, we obtain a group A.* consisting of all operations
which may be represented in the form


(11.8)    $\sum_i X_i a_i$,


where $X_i$ is a polynomial in the marks $x_1, x_2, \ldots, x_\mu$. In the last expression,
addition is, of course, to be regarded as commutative. The group A.* will
be the one determined by the generators

$a_1, a_2, \ldots, a_m$

together with the identities


(11.9)    $\sum_j X_{ij} a_j = 0$


to which the identities (11.5) reduce when expressed in the abridged notation
and when addition is regarded as commutative.




By the methods of §10 it may be verified without difficulty that the
matrix of the group A of a link is the matrix $\|X_{ij}\|$ of the coefficients in (11.8).
To obtain the direct generalization of the theory developed for knots, we
must set

$x_1 = x_2 = \cdots = x_\mu = x$.

A certain number of easily calculable invariants are obtainable by setting
all but one of the marks $x_i$ equal to unity.

12. Miscellaneous theorems. Let the number of regions of a link diagram
be $\nu + 2$. Then, if the curve of the diagram is a connected point set the
number of crossing points must be $\nu$, just as in the special case of a knot diagram.
If the curve of the diagram is not a connected point set but is made
up of $\kappa + 1$ connected pieces, the number of crossing points is only $\nu - \kappa$.
We must then adjoin to the equations of the diagram a set of $\kappa$ equations
of the form 0=0 so that the matrices M and N, §4, shall have two less rows
than columns. For we want the matrix $M_0$ from which we compute the
invariant $\Delta(x)$ to be a square array of order $\nu$. If the number $\kappa$ is greater than
unity the invariant $\Delta(x)$ evidently vanishes.

Several theorems are to be obtained by observing that when we set x
equal to 1 in the matrix M(x)=M the form of the resulting matrix M(1)
will be independent of how the corners at the crossing points are dotted and,
therefore, independent of which branch through a crossing point is regarded
as the one passing under the other. Suppose, for example, we start with the
theorem that the invariant $\Delta(x)$ of an unknotted knot is unity, as may be
verified, at once, by direct calculation. Then, as an immediate consequence,
we may obtain the following theorem about knots in general:


The sum of the coefficients of the invariant $\Delta(x)$ of a knot is always numerically
equal to unity.


For, in the notation of §4, we have


(12.1)    $\Delta(x) = \pm x^{n} \Delta_{r(r+1)}(x)$,


whence, the sum of the coefficients of the invariant $\Delta(x)$ must be given by


$\Delta(1) = \pm \Delta_{r(r+1)}(1)$.


Now, by changing upper into lower branches at a suitably chosen set of
crossing points we may always “unknot” the knot. For one obvious way of
doing this is to reverse crossings in such a manner that if we start at a
specified point P of the curve of the diagram and describe the curve in the
positive sense we never pass through a crossing point along an upper branch
without previously having passed through it along a lower one. But, as we
have already observed, such a reversal of crossings leaves invariant the
value of the determinant $\Delta_{r(r+1)}(1)$. Therefore, since $\Delta(x)$ is unity for an
unknotted knot, we have


$1 = \pm \Delta_{r(r+1)}(1)$,


which proves the theorem. 
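
The theorem is easy to check numerically. The sketch below evaluates a few polynomials at x = 1 by Horner's rule. The trefoil and figure-eight polynomials are the standard values, quoted here as an assumption because the table in §14 is not legible in this copy; the third polynomial is the one written out in full in §14.

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial given by ascending coefficients using Horner's rule."""
    value = 0
    for c in reversed(coeffs):
        value = value * x + c
    return value


knots = {
    "trefoil": [1, -1, 1],                        # 1 - x + x^2 (standard value, assumed)
    "figure-eight": [1, -3, 1],                   # 1 - 3x + x^2 (standard value, assumed)
    "nine crossings, no. 33": [1, -6, 14, -19, 14, -6, 1],  # quoted in section 14
}

for name, coeffs in knots.items():
    # the sum of the coefficients is the value at x = 1
    print(name, evaluate(coeffs, 1))
```

Each printed value is +1 or -1, in agreement with the statement that the sum of the coefficients is numerically equal to unity.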


The multiplicity $\mu$ of a link L is equal to the number of zero elementary
factors of the matrix M(1) obtained by setting x equal to 1 in the matrix M.


For, by an elementary calculation, we verify that the theorem is true
when the link L consists of $\mu$ unknotted and non-interlinking curves.

It should be remembered that if we assign a constant integral value c to x 
in the matrix M and then derive the elementary factors of the matrix M(c), 
regarding the latter as a matrix in integer elements, we do not necessarily 
get the same result as if we derived the elementary factors of the matrix 
M(x) regarded as a matrix in polynomial elements and then substituted the 
value c for x in these factors. For in calculating the factors of the matrix 
M(x) we have to allow the operation of adding to one row (column) a rational 
multiple of another row (column), which operation is not allowed in calculat- 
ing the factors of the matrix M(c) unless the rational multiple is also integral. 
It may, therefore, be worth while noting the following theorem: 


The elementary factors of the matrix M(c) (c integral) that are not of the
form $\pm c^{p}$ are all link invariants to within a multiple of c.


For the matrix operations ($\alpha$), ($\beta$), and ($\gamma$) of §4 leave the elementary
factors of M(c) unaltered, the operation ($\delta$) merely adds or takes away a
unit factor, and the operation ($\varepsilon$) merely multiplies or divides one factor
by c.

Let $\nu+2$ be the number of regions of a link diagram and p the number of
distinct values taken on by the indices of the various crossing points (corresponding
to any one way of assigning indices). Then the degree of the polynomial
$\Delta(x)$ never exceeds $\nu - p$.


For, in the notation of §4, the degree of the polynomial $\Delta(x)$ never exceeds
that of the polynomial $\Delta_{p(p+1)}(x)$. Moreover, by (4.6) we have

$\Delta_{01}(x) = \pm x^{p} \Delta_{p(p+1)}(x)$.


But the degree of $\Delta_{01}(x)$ cannot exceed $\nu$ since this expression is a $\nu$-rowed
determinant with elements that are linear in x. Therefore, the degree of
$\Delta_{p(p+1)}(x)$, and consequently also of $\Delta(x)$, never exceeds $\nu - p$.




If K is a composite knot, §3, made up of the factors $K_1$ and $K_2$, the invariant
$\Delta(x)$ of the knot K is equal to the product of the corresponding invariants
of the knots $K_1$ and $K_2$.


A composite knot is one which may be deformed in such a way that its
diagram will be of the sort considered in the last paragraph of §3, where two
edges ¢, and ¢, appear on the boundaries of two different regions $r_1$ and $r_2$ of
the diagram. In the notation of §3, let a+: and a+72 be the curves of the
diagram of the two factors $K_1$ and $K_2$ of K. Moreover, let $\Delta_{r(r+1)}(x)$ be the
determinant of the knot K after elimination from its matrix M of the two
redundant columns corresponding to the regions $r_1$ and $r_2$ respectively. Thus,
clearly the determinant $\Delta_{r(r+1)}(x)$ is equal to the product of the two similar
determinants $\Delta'_{r(r+1)}(x)$ and $\Delta''_{r(r+1)}(x)$ corresponding to the two factors
$K_1$ and $K_2$; for the determinant $\Delta_{r(r+1)}(x)$ is related to these other two in the
manner illustrated schematically by


0 
= | 


| 
The theorem therefore follows, at once. 
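
As a small illustration of the product theorem, the sketch below multiplies two coefficient lists. The choice of the trefoil polynomial 1 - x + x^2 for both factors is our own assumption (a standard value, not taken from this section); the product is then the invariant of a composite knot with two trefoil factors.

```python
def poly_mul(p, q):
    """Multiply two polynomials given by ascending coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


trefoil = [1, -1, 1]                    # 1 - x + x^2 (assumed standard value)
composite = poly_mul(trefoil, trefoil)  # invariant of the composite knot

print(composite)        # coefficients of the product, lowest degree first
print(sum(composite))   # value at x = 1
```

The product is again symmetric with coefficient sum of absolute value one, as the earlier theorems of this section require of any knot invariant.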

We shall bring these miscellaneous remarks to a close by obtaining a
relation between the polynomial invariants $\Delta(x)$ of three closely associated
links. Consider a link diagram Z' with a right handed crossing point of
index p (Fig. 2a). By changing this one crossing point into a left handed one
(Fig. 2b) we obtain a new diagram Z''. Moreover, by cutting the two
branches at the crossing point, separating them slightly, and rejoining them
so as to unite into one the two regions of index p incident to the crossing
point (Fig. 2c) we obtain a third diagram Z. Now, suppose we form the
matrix N' (§4) of the diagram Z' and arrange the rows and columns in such
an order that the first row corresponds to the crossing point c and that the
first four columns correspond to the four regions incident to the point c
and represented in Fig. 2a by the first, second, third and fourth quadrants
respectively. The first row of the matrix N' will, thus, start with the elements
x, x, 1, 1 followed by zeros. To obtain the corresponding matrix N'' of the
diagram Z'' we shall merely have to interchange the first and third elements
in the first row of the matrix N'; to obtain the corresponding matrix N of the
diagram Z we shall merely have to add the first column of the matrix N'
to the third and then strike out the first row and first column. Now, let
$\Delta'_{p(p+1)}$, $\Delta''_{p(p+1)}$, and $\Delta_{p(p+1)}$ be the determinants of the matrices obtained by
striking out the first two columns of N', N'', and N, respectively. Then,
clearly, the determinant $\Delta_{p(p+1)}$ will be the principal minor of both the
determinants $\Delta'_{p(p+1)}$ and $\Delta''_{p(p+1)}$. Let $\Gamma$ be the minor of the second element




in the first row of $\Delta'_{p(p+1)}$ (and of $\Delta''_{p(p+1)}$). Then if we expand each of the
determinants $\Delta'_{p(p+1)}$ and $\Delta''_{p(p+1)}$ in terms of the elements of its first row
and their minors, we find

= Apiptty) — T, = — T, 

whence, 


which is the relation we had in mind to establish. 

The argument needs to be slightly modified if the two corners of index p
at the crossing point c belong to the same region of the diagram. In this
case, however, the curve of the diagram Z must be a disconnected point set,
whence,

$\Delta_{p(p+1)} = 0$.


Moreover, we obviously have


$\Delta'_{p(p+1)} = \Delta''_{p(p+1)}$,


so that relation (12.2) continues to hold.

13. n-sheeted spreads. Further invariants of the matrix M are to be
obtained by regarding each element as the symbol for a certain square array
of order n, where the connection between the symbols and the arrays which
they represent is as follows:

(i) 0 is the symbol for an array composed entirely of zeros.

(ii) 1 is the symbol for an array with 1’s along the main diagonal and
0’s elsewhere.

(iii) x is the symbol for an array obtained from the array 1 either by
permuting the columns cyclically so that the first column goes into the
second or by permuting the rows cyclically so that the second row goes into
the first; the effect of either permutation is the same. $x^{p}$ is the symbol for
the array obtained from the array 1 by making p successive permutations
of the type just described.

Now, if we replace each element of the matrix M by the array which it
symbolizes we obtain a new array M* of $\nu n$ rows and $(\nu+2)n$ columns,
made up of integer elements.
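
The replacement of symbols by arrays can be sketched in a few lines of Python. Everything below is illustrative: the function names are ours, entries of M are represented as dictionaries mapping an exponent of x to an integer coefficient, and the one-row matrix at the end is a hypothetical example rather than the matrix of any particular diagram.

```python
def power_of_x(p, n):
    """n-by-n array symbolised by x**p: the array 1 with its columns moved cyclically p places."""
    return [[1 if (i + p) % n == j else 0 for j in range(n)] for i in range(n)]


def mat_mul(a, b):
    """Plain integer matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def expand(matrix, n):
    """Replace every entry {exponent: coefficient} by its n-by-n integer block."""
    rows = []
    for row in matrix:
        blocks = []
        for entry in row:
            block = [[0] * n for _ in range(n)]
            for p, c in entry.items():
                xp = power_of_x(p, n)
                for i in range(n):
                    for j in range(n):
                        block[i][j] += c * xp[i][j]
            blocks.append(block)
        for i in range(n):
            rows.append([v for block in blocks for v in block[i]])
    return rows


n = 3
x = power_of_x(1, n)
# x**2 is two successive cyclic permutations, and x**n returns to the array 1
assert mat_mul(x, x) == power_of_x(2, n)
assert mat_mul(mat_mul(x, x), x) == power_of_x(0, n)

# hypothetical one-row matrix with entries x, -x, 1, -1 (illustration only)
row = [[{1: 1}, {1: -1}, {0: 1}, {0: -1}]]
m_star = expand(row, n)
print(len(m_star), len(m_star[0]))  # 3 rows, 12 columns
```

A matrix with one row and four columns thus expands to an integer array with n rows and 4n columns, in line with the count of rows and columns given in the text.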


The elementary factors of the array M* that differ from unity are link
invariants.


For to an elementary operation ($\alpha$), ($\beta$), ($\gamma$), or ($\delta$) on the matrix M there
evidently corresponds an operation on the matrix M* which may be resolved
into elementary operations. Moreover, to an extended operation ($\varepsilon$) on
the matrix M there merely corresponds a cyclical permutation of a block
of n rows (columns) of the matrix M*, which may again be resolved into elementary
operations. This proves the theorem.

Corresponding to a pair of redundant columns of the matrix M there
will be two sets of redundant columns of the matrix M* composed of n
columns each. We may, therefore, always replace the matrix M* by a square
matrix $M_0^*$ of order $\nu n$.

The elementary factors of the array $M_0^*$ have an interesting geometrical
significance. If the link from which we start is a knot, the number of zero
factors is equal to the connectivity numbers,


$(P_1 - 1) = (P_2 - 1)$,


of an n-sheeted Riemann spread $S^n$ (the 3-dimensional generalization of an
n-sheeted Riemann surface) with the knot as branch curve (generalized
branch point). The divisors that are different from zero and unity are the
coefficients of torsion of the spread $S^n$.* This may all be proved very easily
by making a suitable cellular subdivision of the spread $S^n$, as we shall now
indicate.†

Let K be the knot under consideration and S the space containing it,
where to simplify matters, we shall suppose that the space S closes up to a
single point at infinity. Then, the first step will be to cut up the space S into
cells, in the following manner. Wherever one branch of the knot appears to
pass behind another, we shall join the upper branch to the lower one by a
segment $c_i'$. The ends of the segment $c_i'$ will be the vertices, or 0-cells, of
the subdivision; the segments $c_i'$ themselves, together with the arcs $a_i'$
into which the ends of the segments $c_i'$ subdivide the knot will be the 1-cells.
Corresponding to each region $r_i$ of the diagram we shall construct a 2-cell
$r_i'$ of which $r_i$ will be the projection, bounded by the appropriate arcs $a_i'$
and $c_i'$. The residual part of the space S will then consist of a pair of 3-cells.
Thus, to sum up, the subdivision $\Sigma$ that we have just described will consist
of $2\nu$ vertices, $3\nu$ edges, $\nu+2$ 2-cells, and 2 3-cells. Corresponding to the
subdivision $\Sigma$ of the space S there will be a subdivision $\Sigma^n$ of the spread $S^n$


* In a paper read before the National Academy in November, 1920, cf. Veblen, Cambridge
Colloquium Lectures (1922), Analysis Situs, p. 150, I made the observation that the topological
invariants of the n-sheeted spreads associated with a knot would be invariants of the knot itself, and
showed by actual calculation that these invariants could be used to distinguish between a number of
the more elementary knots. Later, these same invariants were discovered independently by K.
Reidemeister, Knoten und Gruppen, Abhandlungen aus dem Mathematischen Seminar der Hamburgischen
Universität, 1926, pp. 7-23. See also a paper by Alexander and Briggs, On types of knotted
curves, Annals of Mathematics, (2), vol. 28 (1927), pp. 563-586.

† Cf. Alexander and Briggs, loc. cit.




such that each cell of the subdivision $\Sigma$ which is not composed of points of
the knot will be covered by n cells of the subdivision $\Sigma^n$, one in each sheet of
the spread $S^n$, and such that each cell of the subdivision $\Sigma$ which is composed
of points of the knot will be covered by just one cell of the subdivision $\Sigma^n$.
Along the knot, of course, the n sheets of the spread $S^n$ merge into one.
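
As a consistency check on the cell counts quoted above for the subdivision $\Sigma$ of the space S, one may form the alternating sum of the numbers of cells. The remark is ours and is not part of the original argument; it uses only the standard fact that the Euler characteristic of the 3-sphere (the space S closed up by a point at infinity) is zero.

```latex
\chi(\Sigma)
  = \underbrace{2\nu}_{\text{0-cells}}
  - \underbrace{3\nu}_{\text{1-cells}}
  + \underbrace{(\nu + 2)}_{\text{2-cells}}
  - \underbrace{2}_{\text{3-cells}}
  = 0 .
```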

We may simplify the subdivision $\Sigma^n$ somewhat by amalgamating all but
one of the $2\nu$ edges covering the edges $a_i'$, along with their end points, into
one single 0-cell; or, if we prefer, by treating this group of 0-cells and 1-cells
as a single generalized vertex A. The remaining 1-cells of the subdivision
will then represent closed 1-circuits beginning and ending at the vertex A,
while the boundaries of the various 2-cells will give the relations of bounding
among these circuits. Clearly the number of 1-circuits will be $\nu n+1$, as there
will be n circuits


(13.1)    $c_{ij}$    $(j = 1, 2, \ldots, n)$,


corresponding to each segment $c_i'$, together with one additional circuit c corresponding
to the residual arc of the branch system that has not been
amalgamated into the generalized vertex A. Moreover, there will be $(\nu+2)n$
relations among these circuits corresponding to the $(\nu+2)n$ 2-cells


(13.2)    $r_{ij}$    $(j = 1, 2, \ldots, n)$


covering the 2-cells $r_i'$ of the subdivision $\Sigma$.

Now, it is easy to verify that the matrix M* is precisely the one exhibiting
the incidence relations between the 1-circuits (13.1) and 2-cells (13.2).
Therefore, to obtain a matrix exhibiting the incidence relations between all
the 1-circuits and 2-cells we have only to adjoin to the matrix M* a new row
corresponding to the remaining 1-circuit c. As the 1-circuit c is on the boundary
of two blocks of n 2-cells each, corresponding to a pair of contiguous
regions $r_0$ and $r_{\nu+1}$ of the diagram, the added row will consist of a block of
n 1’s, a block of n $-1$’s, and zeros. The elementary factors of the matrix with
the added row will be the ones that determine the connectivity numbers
and coefficients of torsion of the spread $S^n$. But this last matrix is evidently
equivalent to the matrix $M_0^*$, for the 2n columns having the elements $\pm 1$
in the last row correspond to the redundant columns of the matrix M*;
hence, by adding to these columns suitable linear combinations of the remaining
$\nu n$ ones we may make all their elements zero except the ones in
the last row. Thus, the geometrical interpretation of the divisors of the
matrix $M_0^*$ is established.

Since the matrix with the added row may be transformed into one such
that in certain columns all the elements will be zeros with the exception of
an element 1 in the last column, it follows that the 1-circuit c must be a
bounding curve,

$c \sim 0$;

for each column determines a relation of bounding among the 1-circuits determined
by the rows. But the 1-circuit c, when we include the generalized
point A which really forms a part of it, is simply the branch curve of the
spread $S^n$ itself. Hence we have the theorem†


The branch curve of the n-sheeted spread determined by a knot K is always a 
bounding curve of the spread. 


The geometrical interpretation of the factors of the matrix M* is not
quite so satisfactory for the case of a link of multiplicity greater than unity.
However, by a similar argument to the one made above, it is easy to show
that these factors give the connectivity numbers and coefficients of torsion
of the spread $S^n$ when we treat the entire branch system of $S^n$ as if it were a
single generalized point.

14. Tabulation of $\Delta(x)$. At the end of the paper referred to in the last
footnote a chart has been drawn up showing diagrams of the eighty-four
knots of nine or less crossings listed as distinct by Tait and Kirkman; also
a table giving the torsion numbers of the 2- and 3-sheeted Riemann spreads


Sie 


1-—5+9 3-749 One 1—5+7-7 

1—6+9 3-—8+11 76 1—5+9-9 

520 1-—7+11 3—124+17 9200 1—5+9-—11 
61a 1-—7+13 3—12+19 9224 1—5+10—11 
2-—3+3 3—14+421 _ 128 
720 2—4+5 4-—8+9 
816 2—5+5 4—9-+11 9260 1—5+11—13 
74a 2—6+7 4—10+13 1—5+11-—15 
2—6+9 4—11+15 _ 16+ 
82a 2—74+9 5—14+19 
2—7+11 1—1+0+1 1—5+12-—17 
2—8+11+ 1—i+i1-1 1—5+13-—17 
Sie 1—3+2-1 9320 1—6+14-—17 
2—9+13 1—34+3-—3 9330 1—6+14-—19 
2-—9+15 1—3+4-—5 1-—6+16—23 
624 2—10+15 1—3+5-—5 9:00 1—7+18—23 
630 2—10+17 1-—3+5-7 2-—34+3-—3 

2—11+17 1—3+6-7 %a 2—4+5-—5 

x60 2-—11+19 1-—4+6-—5 2—4+6-—7 

Dain 3—5+5 1—44+ 8-9 O60 2—54+8-—9 

76a 3-—6+7 1—4+ 8-11 10 1—1+1-1+1 




† A. and B., loc. cit.




associated with the knots. We list below the values of the polynomial $\Delta(x)$
for these same knots. It appears that the polynomial $\Delta(x)$ of a knot is always
of even degree and that its coefficients are arranged symmetrically with
reference to the middle one. Therefore, to reduce the space occupied by the
table we have merely indicated the values of the coefficients of $\Delta(x)$ up to,
and including, the middle one. To illustrate how the table is to be used,
let us turn, for example, to the entry


$9_{33a}$ | 1 $-$ 6 + 14 $-$ 19


This indicates that an alternating knot of nine crossings listed as the thirty-third
in the chart referred to above has the polynomial invariant


$\Delta(x) = 1 - 6x + 14x^2 - 19x^3 + 14x^4 - 6x^5 + x^6$.
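
The rule for reading the table can be written as a one-line mirror operation. The following Python sketch (the function name is ours) rebuilds the full list of coefficients from the abbreviated entry discussed above.

```python
def expand_entry(half):
    """Rebuild all coefficients from those listed up to and including the middle one."""
    half = list(half)
    # mirror everything before the middle coefficient, in reverse order
    return half + half[-2::-1]


entry = [1, -6, 14, -19]          # tabulated values for the knot discussed above
print(expand_entry(entry))        # coefficients of 1, x, x^2, ..., x^6
```

Applied to the entry 1, -6, 14, -19 this gives the seven coefficients of the polynomial of degree six written out above; an entry with a single coefficient is returned unchanged.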


In the table, six repetitions of the same polynomial appear. In the three cases
marked with (*) the two paired knots may be distinguished with the aid of
other invariants of the matrix M, such, for example, as the coefficients
of torsion of the associated Riemann spaces. In the three cases marked with
(†) the two paired knots have $\varepsilon$-equivalent matrices M and, therefore,
cannot be distinguished (assuming that they actually are distinct, which
Tait never really proves) by the methods of this paper.


PRINCETON UNIVERSITY, 
PRINCETON, N. J. 


ON THE EXPANSION OF ANALYTIC FUNCTIONS IN 
SERIES OF POLYNOMIALS AND IN SERIES OF 
OTHER ANALYTIC FUNCTIONS* 


BY 
J. L. WALSH 


1. Introduction. The present paper is substantially a continuation of
a previous paper† in which polynomial developments of an arbitrary analytic
function were considered, culminating in three theorems. The first of these
theorems is in essence a modification and completion of a result due to
Birkhoff, a generalization of Taylor’s development about the origin in the
plane of the complex variable x:‡


THEOREM I. Let the functions
$p_0(x), p_1(x), p_2(x), \ldots$


be analytic for $|x| < 1+\varepsilon$, and such that on and within the circle $\gamma'$: $|x| = 1+\varepsilon$,
we have


(1)    $|p_k(x) - x^k| \le \varepsilon_k$    $(k = 0, 1, 2, \ldots)$,
where the series $\sum \varepsilon_k^2$ converges to a sum less than unity, and where the series


$\sum \varepsilon_k$ converges. Then there exists a set of functions $P_k(x)$ continuous for
$|x| \ge 1$, analytic for $|x| > 1$,§ zero at infinity, and such that

f0, ix k, 
1, k, 

If F(x) is any function integrable and with an integrable square (in the sense of 

Lebesgue), then the two series 


1 
(3) a, = 
7 


k=0 


(4) ce f F(x) Pi(x)dz, 


have on $\gamma$ (and hence‖ in the closed region $|x| \le 1$) essentially the same convergence
properties, in the sense that their term-by-term difference approaches
uniformly and absolutely the sum zero. In particular if F(x) is continuous for
$|x| \le 1$, analytic for $|x| < 1$, and satisfies a Lipschitz condition on $\gamma$, then the
series (4) converges uniformly to the sum F(x) in the closed region $|x| \le 1$.


* Presented to the Society, September 9, 1927; received by the editors in May, 1927.

† These Transactions, vol. 26 (1924), pp. 155-170. We shall refer to this paper as I.

‡ I, p. 159.

§ See below, § 2.

‖ A convergent series of constant terms dominates the term-by-term difference of series (3)
and (4) for $|x| = 1$ and hence for $|x| < 1$.


In I this theorem was applied, after conformal transformation, to obtain 
the two other theorems mentioned, the first on the expansion of an analytic 
function in terms of polynomials, the second including the analogue of the 
Laurent series. In the present paper we treat (Part A) more in detail the 
analogy between the two series (3) and (4), considering arbitrary series of 
type (4), the analogue of Abel’s theorem and its converse, convergence
properties on circles other than $\gamma$, and the uniqueness of expansions. In
Part B we apply these results to the case of polynomials belonging to a given 
region, and collect the main results of the paper in Theorem IX. We con- 
sider in particular the expansion of a discontinuous function, in Theorem XI. 
It is found that under certain conditions Gibbs’s phenomenon occurs, 
precisely as for Fourier’s series. In Part C we study the use of polynomial 
expansions in connection with multiply-connected regions, obtaining certain 
results on the boundary values of analytic functions. 


A. SERIES OF ANALYTIC FUNCTIONS 


2. Modification of proof of Theorem I. The proof of Theorem I given 
in I is needlessly complicated. It is perhaps worth while to present in some 
detail a modification, for we shall need later certain inequalities obtained. 

We apply the Lemma used in I, choosing the interval $0 \le \varphi \le 2\pi$ as
the circle $\gamma$: $|x| = 1$, using $x = e^{i\varphi}$ on $\gamma$. The functions $\{u_n(\varphi)\}$ and $\{U_n(\varphi)\}$
are taken (modifying the argument of I, pp. 162-3) simply as


(5) u,(o) = U,(¢) = Pal) (m = 0,1,2,---) 
Thus we have immediately 
(6) Cok = f (Un — Un) 
0 
(7) S f (Un — — 
k=0 
or 1 dx 


dx 


1 
= (Pa(x) — x 


1928] EXPANSIONS OF ANALYTIC FUNCTIONS 309 


En 
8 | 
(8) | car | 
The function $V_i(\varphi)$ of I is therefore given by the equation*
(9) = Do(des + 
s=0 


We have, however, the inequalities 


(10) | dis + | s p? == > | Cij = > | 
j=0 i,jm 
(11) | dis | « | des + + > | cel 
i=0 i=0 i=0 


We define the functions $P_i(x)$ so as to make the two following series identical:


ce = 


k=0 
(12) = f 
k=0 7 
That is, we set 
by - do 
We have of course

$x = e^{i\varphi}, \qquad dx = i e^{i\varphi}\, d\varphi, \qquad d\varphi = \dfrac{dx}{ix}$.


It follows, then, directly from (9) and (5) that the functions $P_i(x)$ are continuous
for $|x| \ge 1$, analytic for $|x| > 1$, and zero at infinity. Moreover,
if the series $\sum \varepsilon_k$ is dominated by a convergent geometric series, then the
functions $P_i(x)$ are analytic likewise for $|x| = 1$.† In fact, the series (11) is
also dominated by a convergent geometric series, by virtue of (7):


* See Walsh, these Transactions, vol. 22 (1921), p. 234, where the Lemma used in I is proved,
and inequalities (10) and (11) likewise derived.

† The writer withdraws the statement in I, pp. 159, 163, that the functions $P_i(x)$ are analytic
on $\gamma$, when the $\varepsilon_k$ are not further restricted. Thus in the proof of I, Theorem I, we choose the $\varepsilon_k$
so that the series $\sum \varepsilon_k$ is dominated by a convergent geometric series.




and by virtue of (8). Then the series (9), when conjugate complex quantities
are taken, is a Laurent series whose coefficients are dominated by a convergent
geometric series, so $V_i(x)$, and hence also $P_i(x)$, is analytic for
$|x| = 1$.

3. Development of continuous functions on γ. If the function F(x) is
continuous for |x| ≤ 1 and analytic for |x| < 1, then the Taylor development
of F(x) about the origin converges, when summed by the method of Cesàro,
uniformly for |x| ≤ 1 to the value F(x). In fact the Taylor development is
on γ precisely the Fourier development of F(x), which when summed as
described converges uniformly on γ to the value F(x), hence uniformly on
and within γ to the value F(x). The Taylor series itself converges for |x| < 1,
by the usual inequalities for the coefficients of a power series, and hence
converges to the value F(x), because in case of a convergent series the sum
assigned by the Cesàro summation process is the sum of the series.
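The closing remark, that Cesàro summation assigns to a convergent series its ordinary sum while also summing some divergent series, can be illustrated numerically. The following is a minimal sketch only; the function names `partial_sums` and `cesaro_means` and both sample series are illustrative choices and do not come from the paper.

```python
from itertools import accumulate


def partial_sums(terms):
    """Return the sequence of partial sums s_0, s_1, ... of the given terms."""
    return list(accumulate(terms))


def cesaro_means(terms):
    """Return the arithmetic means (s_0 + ... + s_n) / (n + 1) of the partial sums."""
    running_totals = accumulate(partial_sums(terms))
    return [total / (n + 1) for n, total in enumerate(running_totals)]


# A divergent series, 1 - 1 + 1 - ..., whose Cesaro means settle at 1/2.
grandi = [(-1) ** k for k in range(1000)]
grandi_means = cesaro_means(grandi)

# A convergent series, the geometric series with ratio 1/2 and sum 2.
geometric = [0.5 ** k for k in range(1000)]
geometric_sums = partial_sums(geometric)
geometric_means = cesaro_means(geometric)
```

For the convergent series the means approach the same limit as the partial sums, only more slowly, which is the fact used in the text.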

Application of this remark yields, if we remember that series (3) and (4)
have essentially the same convergence properties in the entire closed region
|x| ≤ 1,


THEOREM II. If F(x) is continuous for |x| ≤ 1 and analytic for |x| < 1,
then the series (4) converges uniformly to the sum F(x) in any closed region
|x| ≤ |x_0| < 1, and the sequence formed from (4) by the Cesàro summation
method converges uniformly for |x| ≤ 1, to the sum F(x).


We turn now from the consideration of series (4) arising from functions
F(x) given on γ, to the consideration of series of the form

\[ \sum_{k=0}^{\infty} g_k p_k(x) \tag{13} \]

with arbitrary coefficients g_k.

4. Convergence of arbitrary series (13). If no further restriction is placed
on the quantities ε_k than in Theorem I, it is not true that the convergence
of (13) for x = x_0 enables us to conclude the convergence of (13) for all values
of x such that |x| < |x_0|. Let us set, in fact,

\[ p_0(x) = 1, \qquad p_k(x) = x^k - \delta^k, \quad k > 0, \]


where δ is positive and so small that for ε_0 = 0, ε_k = δ^k, k > 0, the required
conditions on ε_k are fulfilled. Then every series (13) converges for x = δ,
yet need not converge for every x such that |x| < δ. Indeed, under this same
definition for p_k(x), every series

\[ \sum_{k=0}^{\infty} g_k p_{k!}(x) \]


1928] EXPANSIONS OF ANALYTIC FUNCTIONS 311 


converges whenever x = ωδ, ω^{m!} = 1, m being integral. This series converges
therefore on a point set everywhere dense on the circle |x| = δ, yet does not
necessarily converge for x = 0.
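The counterexample can be checked with exact rational arithmetic. The sketch below assumes the functions are p_0(x) = 1 and p_k(x) = x^k − δ^k for k > 0; the value δ = 1/4 and the coefficients g_0 = 0, g_k = δ^{−k} are illustrative choices, not taken from the paper, and no claim is made that this δ meets the conditions of Theorem I.

```python
from fractions import Fraction

DELTA = Fraction(1, 4)  # illustrative value of delta (an assumption)


def p(k, x):
    """p_0(x) = 1 and p_k(x) = x**k - DELTA**k for k > 0."""
    if k == 0:
        return Fraction(1)
    return x ** k - DELTA ** k


def g(k):
    """Illustrative coefficients: g_0 = 0, g_k = DELTA**(-k) for k > 0."""
    return Fraction(0) if k == 0 else 1 / DELTA ** k


def partial_sum(x, n):
    """Sum of g_k * p_k(x) for k = 0, ..., n."""
    return sum(g(k) * p(k, x) for k in range(n + 1))


# At x = DELTA every term with k > 0 vanishes, so the series converges.
sum_at_delta = partial_sum(DELTA, 50)

# At x = 0 every term with k > 0 equals -1, so the partial sums are unbounded.
sum_at_zero = partial_sum(Fraction(0), 50)
```

With these coefficients the series converges at x = δ but diverges at the interior point x = 0, which is the behavior the text describes.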

Under suitable restrictions on the ε_k we can prove the result for series
(13) which is analogous to the well known result for Taylor's series:


THEOREM III. If the series (13) converges for x = x_0, where |x_0| ≤ 1+ε,
and if the series Σ_{k=0}^∞ ε_k t^k converges for every (finite) value of t, then the series
(13) converges for all values of x such that |x| < |x_0|, and the convergence is
uniform for all values of x such that |x| ≤ |x_1| < |x_0|.


We naturally assume x_0 ≠ 0; the contrary case is without content. We
prove actually a stronger theorem than that stated, for we use not the
convergence of (13) for x = x_0 but merely the boundedness of the terms of the
series.

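The inequality

\[ |p_k(x_0) - x_0^k| \le \varepsilon_k \]

gives at once the double inequality

\[ 1 - \frac{\varepsilon_k}{|x_0^k|} \le \frac{|p_k(x_0)|}{|x_0^k|} \le 1 + \frac{\varepsilon_k}{|x_0^k|}. \]

But we have lim_{k→∞} ε_k/|x_0^k| = 0, and hence lim_{k→∞} |p_k(x_0)|/|x_0^k| = 1.
Therefore if the quantities g_k p_k(x_0) are uniformly bounded, so also are the
quantities g_k x_0^k, and conversely. From the boundedness of the g_k x_0^k:

\[ |g_k x_0^k| \le M, \]

follows the inequality

\[ |g_k| \le \frac{M}{|x_0^k|}. \]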


We now make use of the inequality

\[ \sum_{k=0}^{\infty} |g_k p_k(x)| \le \sum_{k=0}^{\infty} |g_k x^k| + \sum_{k=0}^{\infty} |g_k| \varepsilon_k. \]


The first series on the right converges uniformly for |x| ≤ |x_1|, since the
individual terms of that series are bounded for x = x_0. The second series
on the right, which does not contain x, converges. Hence the series on the
left converges uniformly for |x| ≤ |x_1|, and Theorem III is established.

The argument just given includes practically a proof of the fact that for
points x such that |x| ≤ 1+ε, the two series

\[ \sum_{k=0}^{\infty} g_k x^k \quad \text{and} \quad \sum_{k=0}^{\infty} g_k p_k(x) \]

have the same points of convergence, of absolute convergence, of divergence,
of summability, and the same regions (or point sets) of uniform convergence.
The only exception here occurs if the Taylor series converges only at the
point x = 0. For if there exists a single value of x, say x_0 ≠ 0, of the kind
considered so that either of these two series converges, we have

\[ |g_k| \le \frac{M}{|x_0^k|}. \]

It follows that for all values of x, |x| ≤ 1+ε, we have uniformly

\[ |g_k p_k(x) - g_k x^k| \le \frac{M \varepsilon_k}{|x_0^k|}. \]

The series Σ_{k=0}^∞ M ε_k/|x_0^k| converges and does not contain x, so the
statement is proved.

There is no exceptional case here, even if the Taylor series converges
only for x = 0, provided p_k(0) = 0, k = 1, 2, ...; compare Theorems VII
and IX below.

We can state now two interesting results of this discussion; the first
result gives the radius of convergence in terms of the coefficients. Here and
in the remainder of the paper we assume, unless otherwise stated, the series
Σ ε_k t^k to converge for all values of t.


THEOREM IV. If we set

\[ \limsup_{k \to \infty} |g_k|^{1/k} = \frac{1}{\rho}, \]

then if ρ < 1+ε, series (13) converges for |x| < ρ and diverges for 1+ε ≥ |x| > ρ;
if ρ > 1+ε, series (13) converges for |x| ≤ 1+ε.


The generalized theorem of Abel yields its analogue for the series (13): 


THEOREM V. If the series (13) converges for the value x = x_1, where |x_1|
≤ 1+ε, then this series converges uniformly in the closed region bounded by
two arbitrary line segments terminating at the point x_1 and by an arc of the circle
|x| = |x_2| < |x_1|.

If (13) converges uniformly on an arc x_1x_2 of the circle |x| = |x_1|, then
this series converges uniformly in the closed region bounded by this circular arc,
by two arbitrary line segments lying in the region |x| ≤ |x_1| and terminated
respectively by x_1 and x_2, and by an arc of a circle |x| = |x_3| < |x_1|.




5. Analogue of Tauber’s theorem. For series (13) we can give likewise 
a converse (not exact) of Abel’s Theorem: 


THEOREM VI. If the series (13) is such that lim_{k→∞} k g_k = 0, and if for radial
approach* to the point x_1 on γ we have

\[ \lim_{x \to x_1} f(x) = g, \]

where f(x) denotes the value of the (convergent for |x| < 1) series (13), then we
have also

\[ \sum_{k=0}^{\infty} g_k p_k(x_1) = g. \]
If we define the numbers b_k by the relations

\[ b_k = (2\pi)^{1/2} g_k \]

and then set


(14) = be + Condo + Cindi + + --- (k = 0,1,2,---), 


we find by the use of the Schwarz inequality in conjunction with (8), 


(1 + «)* 


By our hypothesis on the g_k, the series Σ_{k=0}^∞ |b_k|² converges. The a_k defined
by (14) are such that lim_{k→∞} k a_k = 0, by (15). These numbers a_k are such
that Σ_{k=0}^∞ |a_k|² converges, hence, by the Riesz-Fischer theorem, there
exists a function F(x) defined on γ, integrable and with an integrable square,
whose coefficients with respect to the normal orthogonal system {u_k} used
in §2 are the numbers a_k. The numbers b_k, subjected to the condition that
Σ_{k=0}^∞ |b_k|² should converge, are uniquely determined by (14),† and hence
the numbers g_k = b_k/(2π)^{1/2} are the coefficients of F(x) in its expansion (4).
Thus (3) and (13) have essentially the same convergence properties in and on γ.


(15) S 


By Tauber’s theorem,} we have, since limi. =0, 


k=0 


* The result holds also for approach in various other ways. See for example Landau, Ergebnisse 
der Funktionentheorie, Berlin, 1916, Kap. III. Compare our application of Theorem VI in Theorem 
IX. 

† See the reference in § 2 to the proof of the Lemma used in I.

‡ See Landau, loc. cit.


and thus we have as well

\[ \sum_{k=0}^{\infty} g_k p_k(x_1) = g, \]


and the theorem is established. 

Many other results similar to Theorem VI, analogues of results for
Taylor's series, might be established. We choose, however, to treat the
equivalence of series (3) and (4) on circles other than γ.

6. Properties of series on circles other than γ. We prove the following
theorem:


THEOREM VII. If p_k(x) has at least a k-fold root at the origin, then an
arbitrary function F(x) integrable and with an integrable square on the circle
Γ: |x| = μ < 1+ε can be formally expanded in a series of type (4), where the
coefficients are found by integration over Γ. This series (4) and the Taylor
development (formal) of F(x) have on and within the circle Γ the same convergence
properties, in the sense that their term-by-term difference converges absolutely
and uniformly on Γ and in its interior to the sum zero.


We perform the substitution z = x/μ, x = μz, so as to apply Theorem I
directly to the unit circle in the z-plane. We require for application of
Theorem I the inequality

\[ \left| \frac{p_k(\mu z)}{\mu^k} - z^k \right| \le \varepsilon_k \quad \text{for all } |z| \le 1 + \varepsilon'. \tag{16} \]

Expansion of the function F(μz) on the circle |z| = 1 in terms of the functions
p_k(μz)/μ^k, which approximate to the functions z^k, will yield of course a formal
expansion of F(x) on the circle Γ in terms of the functions p_k(x). The Taylor
expansion of F(μz), a power series in z, transforms into a Taylor expansion
of F(x), a power series in x.
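Our original inequality

\[ |p_k(x) - x^k| \le \varepsilon_k, \qquad |x| \le 1 + \varepsilon, \]

may be written

\[ \left| \frac{p_k(x)}{x^k} - 1 \right| \le \frac{\varepsilon_k}{|x^k|}, \qquad |x| \le 1 + \varepsilon. \]

But the function (p_k(x)/x^k) − 1 is analytic without exception for |x| ≤ 1+ε,
when properly defined for x = 0, and its greatest absolute value in that
closed region is taken on for |x| = 1+ε. Thus we have

\[ \left| \frac{p_k(x)}{x^k} - 1 \right| \le \frac{\varepsilon_k}{(1+\varepsilon)^k}, \qquad |x| \le 1 + \varepsilon. \]

Transformation to the z-plane gives the equivalent inequalities

\[ \left| \frac{p_k(\mu z)}{\mu^k} - z^k \right| \le \frac{\varepsilon_k |z|^k}{(1+\varepsilon)^k}, \qquad |z| \le \frac{1+\varepsilon}{\mu}. \tag{17} \]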


The right-hand member of (17) is not greater than ε_k provided we restrict
z as follows:

\[ |z| \le \frac{1+\varepsilon}{\mu} \ \text{ if } \mu > 1, \qquad |z| \le 1 + \varepsilon \ \text{ if } \mu \le 1. \tag{18} \]

The upper limits of z in (18) are both greater than unity, so (17) yields
directly (16), and Theorem VII is completely established.

The proof of Theorem VII has not assumed any restriction on the
quantities ε_k beyond that of Theorem I. In fact, the condition that p_k(x) should
have at least a k-fold root at the origin is a very favorable one with reference
to successive approximations and equivalence of expansions. With the
conditions imposed, the requirements on the ε_k of Theorem I can be
considerably lightened; we do not, however, carry out the details here.

We remark, too, that a result similar to Theorem VII is readily proved
under the assumption that a convergent geometric series dominates the
series Σ ε_k, without the assumption that p_k(x) has at least a k-fold root at
the origin; but here there is in general a lower limit greater than zero on the
radius μ of the circle Γ. We omit the proof of this remark.

The following theorem is by no means the most general result of its 
kind that can be easily established: 


THEOREM VIII. If p_k(x) has at least a k-fold root at the origin, and if the
series Σ ε_k t^k converges for every t, then the expansion of type (4) of any function
F(x) analytic at the origin is unique. The functions P_k(x) of Theorem I are
analytic over the entire plane except at the origin.


If F(x) is analytic on and within γ, there cannot exist two distinct
expansions of F(x) of the form

\[ F(x) = \sum_{k=0}^{\infty} c_k p_k(x), \qquad F(x) = \sum_{k=0}^{\infty} g_k p_k(x) \tag{19} \]

both of which converge uniformly on γ. For multiplication of these series
through by P_k(x) dx and integration term by term over γ gives by (2) the
equality of c_k and g_k.




We return to the more general situation of Theorem VIII. If two series
of the form (19) both converge at even a single point for which |x| ≤ 1+ε,
they converge uniformly on and within some circle Γ whose center is the
origin. By the remark just made concerning uniqueness of expansions, and
by the proof of Theorem VII, the two expansions are identical if they
represent the same function on any circle whatever whose center is the origin.
If the two series represent F(x) in a region lying interior to the circle
|x| = 1+ε, they both represent that function throughout their entire regions
of convergence interior to the circle |x| = 1+ε.

The analyticity of the functions P_k(x) of Theorem I over the entire plane
except at the origin follows from (9) and (11) as used in §2, with the new
properties of the series Σ ε_k. Compare also Theorem IXa, which does not
use those new properties. Under the present hypothesis, then, the integrals
which appear in (4) can be taken over any rectifiable Jordan curve which lies
interior to γ and in whose interior the origin lies, provided the function F(x) is
analytic for |x| ≤ 1. The functions P_k(x) which arise in Theorem I for the
circle γ, and the functions P_k(x) which arise in the proof of Theorem VII
by application of Theorem I to the transform in the z-plane of the circle Γ are
identical; this can be verified by making the change of variable in all the
formulas involved.


B. SERIES OF POLYNOMIALS 


7. Application of results of A. We now apply Theorem I and the
theorems which have just been proved in connection with it, deriving
results as in I (p. 163 et seq.) for expansions of arbitrary functions in terms
of polynomials. We choose the quantities ε_k to satisfy the requirements of
Theorem I and also so that Σ ε_k t^k converges for every t.* The polynomial
p_k(z) is to be chosen so as to have a k-fold root at the origin; this choice is
possible; compare I, p. 164, or Theorem X below. Then we have


THEOREM IX. In the plane of the complex variable z let C be a simple
closed finite analytic curve† which includes in its interior the origin. Then


* See also the condition of I, p. 164, and its application in the proof of Theorem IXa.

† That is to say, a curve whose points can be put into one-to-one (regular-) analytic
correspondence with the points of a circle. It is then a classical theorem in the study of conformal mapping that
the region interior to C can be mapped on the interior of a circle so that the mapping is one-to-one
and conformal in the closed regions considered, therefore one-to-one and conformal in larger regions
including those closed regions in their interiors. See Picard, Traité d'Analyse, II, Paris, 1893, pp. 272,
276, or Bieberbach, Einführung in die konforme Abbildung, Berlin, 1913, p. 120.

From this theorem it follows at once, in the notation of I or of Theorem IX, that for points z_1
and z_2 on C, the quotients (z_1 − z_2)/(φ(z_1) − φ(z_2)) and (φ(z_1) − φ(z_2))/(z_1 − z_2) are uniformly bounded.
Hence a function which satisfies a Lipschitz condition on C corresponds to a function which satisfies
a Lipschitz condition on the circle γ, and conversely.




the interior of C can be mapped one-to-one and conformally on the interior of
the unit circle γ in the x-plane by some transformation x = φ(z), z = ψ(x),
where φ(0) = 0, and the transformation will be one-to-one and conformal for
the mapping of the closed interior of C_{1+ε}, an analytic Jordan curve in whose
interior C lies, onto the closed interior of the circle |x| = 1+ε, ε > 0. In general
we denote by C_ρ the transform of the circle |x| = ρ, where 0 < ρ ≤ 1+ε.

Then there exists a set {p_k(z)} of polynomials in z and a set of functions
{s_k(z)} analytic at every point of the extended plane except the origin, zero at
infinity, and such that

\[ \int_C p_i(z) s_k(z)\, dz = \begin{cases} 0, & i \ne k, \\ 1, & i = k. \end{cases} \]

If the function F(z) is analytic interior to C_ρ, continuous in the corresponding
closed region, and satisfies a Lipschitz condition on C_ρ, then the series

\[ \sum_{k=0}^{\infty} a_k p_k(z), \qquad a_k = \int_{C_\rho} F(z) s_k(z)\, dz, \tag{20} \]

converges uniformly in this closed region to the value F(z). If F(z) is required
merely to be analytic interior to C_ρ and continuous in the corresponding closed
region, then (20) converges uniformly to the value F(z) interior to an arbitrary
curve C_{ρ′}, ρ′ < ρ, and when summed by the method of Cesàro, (20) converges
uniformly to the value F(z) in the closed region bounded by C_ρ.

If F(z) is an arbitrary function defined on C_ρ, integrable and with an
integrable square, and if the condition*


F(z)z*dz = 0 


Cop 


is satisfied, then the two series 


=f 


z|=p 


and (20) transformed by z = ψ(x) have essentially the same convergence properties
on and within the circle |x| = ρ, in the sense that their term-by-term difference
converges absolutely and uniformly to the sum zero for |x| ≤ ρ.


* No condition is necessary here if we use 


o=f 
instead of (20). 


k=O 2 


An arbitrary series of the form

\[ \sum_{k=0}^{\infty} g_k p_k(z) \tag{21} \]

which converges for a single point z on C_ρ, converges uniformly interior to C_{ρ′},
if ρ′ < ρ. If (21) diverges for a point z on C_ρ, that series diverges for all points z
exterior to C_ρ and interior to C_{1+ε}. If in general we set

\[ \limsup_{k \to \infty} |g_k|^{1/k} = \frac{1}{\rho}, \]

then if ρ < 1+ε, series (21) converges for z interior to C_ρ and diverges for z
exterior to C_ρ but interior to C_{1+ε}; if ρ > 1+ε, series (21) converges for z on or
interior to C_{1+ε}. If 0 < ρ < 1+ε, some singular point of the function represented
by the series lies on the curve C_ρ.

If (21) converges for the value z = z_1 on C_ρ, then this series converges uniformly
in the closed region bounded by two arbitrary line segments terminating in z_1
and by an arc of a curve C_{ρ′}, where ρ′ < ρ. If (21) converges uniformly on an
arc z_1z_2 of the curve C_ρ, then this series converges uniformly in the closed region
bounded by that arc, by two arbitrary line segments whose interiors are interior
to C_ρ and which are terminated respectively by z_1 and z_2, and by an arc of the
curve C_{ρ′}, where ρ′ < ρ.

If (21) is such that lim_{k→∞} k ρ^k g_k = 0, and if for approach along the
normal* to C_ρ to the point z_1 on C_ρ we have

\[ \lim_{z \to z_1} f(z) = g, \]

where f(z) denotes the value of the (convergent for z interior to C_ρ) series (21),
then we have also

\[ \sum_{k=0}^{\infty} g_k p_k(z_1) = g. \]

An arbitrary series (21), convergent for a single value of z interior to C_{1+ε}
and not the origin, is the unique expansion of form (20) of some function F(z)
analytic on and within some curve C_ρ.


The only part of this theorem not a direct result of our previous theorems
is the fact that s_k(z) is analytic over the entire z-plane except at the origin.
This should give the reader no difficulty, using Theorem VIII and the method
of I, p. 165; compare also §9.


* A more general theorem might easily be announced; see the footnote to Theorem VI. Here we 
do not apply Theorem VI directly, but the more general theorem suggested in connection with 
Theorem VI. 




It will be noticed that Theorem IX does not mention convergence of
the series (20) or (21) outside of C_{1+ε}. The reason for this omission will
become clearer after we have proved a general theorem on approximation.

8. A general theorem on approximation. We prove a much more gen- 
eral theorem than necessary for our immediate purposes:* 


THEOREM X. If the function f(z), defined on the bounded point set S, can
be approximated on that point set as closely as desired by a polynomial in z,
and if there be given any p points z_1, z_2, ..., z_p of S together with an arbitrary
ε > 0, then there exists a polynomial p(z) such that

\[ |p(z) - f(z)| \le \varepsilon, \quad z \text{ on } S, \]
\[ p(z_i) = f(z_i) \qquad (i = 1, 2, \dots, p). \]


We prove Theorem X by means of Lagrange's Interpolation Formula,
and find it convenient first to prove the following


LEMMA. If z_1, z_2, ..., z_p, R are considered fixed, if we have

\[ |G_k| \le \eta \qquad (k = 1, 2, \dots, p), \]

and if G(z) denotes the polynomial defined by Lagrange's Interpolation Formula
which takes on the values G_k in the points z_k, k = 1, 2, ..., p, then there exists
a constant M independent of η so that we have

\[ |G(z)| \le M\eta \quad \text{for all } z,\ |z| \le R. \]

For simplicity we take R so large that |z_k| < R, k = 1, 2, ..., p. The
Lagrange Formula is

\[ G(z) = \sum_{k=1}^{p} G_k \frac{(z - z_1) \cdots (z - z_{k-1})(z - z_{k+1}) \cdots (z - z_p)}{(z_k - z_1) \cdots (z_k - z_{k-1})(z_k - z_{k+1}) \cdots (z_k - z_p)}, \]


so the Lemma is obvious if we merely set 
(2R)?- 


* It seems inconceivable that this theorem is not already in the literature, but the writer has 
been unable to find a reference to it. The corresponding theorem for approximation by trigonometric 
polynomials is given by D. Jackson, Bulletin of the American Mathematical Society, vol. 32 (1926), 
pp. 259-262. 

Theorem X holds of course for approximation of real functions by means of real polynomials, 
and can be extended (1) by requiring the agreement of certain derivatives of the approximating 
polynomial with the corresponding derivatives of the given function, (2) by assigning as the values of 
the polynomial (and derivatives) not the exact but values near to the values of the function (and 
derivatives), (3) by noticing that for points 2; off S arbitrary values f(z,) may be assigned. 




Theorem X follows easily now. Choose R so large that all points of S
lie in the circle |z| ≤ R. Choose a polynomial q(z) (which exists by hypothesis)


so that we have 
€ 


| g(z) — f(z)| zonS. 


For the polynomial G(z) we assign the values

\[ G(z_k) = q(z_k) - f(z_k) \]
so that we have by the Lemma 


M 
|G(z)| < 


Then the polynomial

\[ p(z) = q(z) - G(z) \]


satisfies all the requirements of Theorem X. 
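The construction in this proof, an approximating polynomial corrected by a Lagrange interpolant of its errors at the prescribed points, can be sketched numerically. The example below is an illustration under stated assumptions: the function f = exp, the point set S (sampled on the unit circle), the degree-8 Taylor polynomial used for q, and the three nodes are all hypothetical choices, not taken from the paper.

```python
import cmath
import math


def lagrange_polynomial(nodes, values):
    """Return G with G(nodes[k]) = values[k], built from Lagrange's formula."""
    def G(z):
        total = 0
        for k, (zk, gk) in enumerate(zip(nodes, values)):
            term = gk
            for j, zj in enumerate(nodes):
                if j != k:
                    term *= (z - zj) / (zk - zj)
            total += term
        return total
    return G


def exact_at_nodes(q, f, nodes):
    """Return p = q - G, where G interpolates the errors q - f at the nodes."""
    G = lagrange_polynomial(nodes, [q(zk) - f(zk) for zk in nodes])
    return lambda z: q(z) - G(z)


def f(z):
    return cmath.exp(z)


def q(z):
    # Degree-8 Taylor polynomial of exp, an illustrative approximant on |z| = 1.
    return sum(z ** n / math.factorial(n) for n in range(9))


nodes = [1, 1j, -1]
p = exact_at_nodes(q, f, nodes)

sample = [cmath.exp(2j * cmath.pi * m / 200) for m in range(200)]
max_error = max(abs(p(z) - f(z)) for z in sample)
node_errors = [abs(p(zk) - f(zk)) for zk in nodes]
```

The corrected polynomial agrees with f at the nodes up to rounding, while its error elsewhere on the sampled set stays of the same order as that of q, as the Lemma predicts.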

The polynomials p_k(z) of Theorem IX are polynomials which uniformly
approximate to the functions [φ(z)]^k respectively on and within C_{1+ε}. By
a classical theorem due to Runge, these polynomials may be subjected to
the auxiliary condition of uniformly approximating other analytic functions
(let us say constants) in arbitrary non-intersecting closed Jordan regions
outside of C_{1+ε}. In particular we may by Theorem X require that the
polynomials p_k(z) shall actually take on arbitrarily preassigned values at
an arbitrary number of points exterior to C_{1+ε}. Thus we may choose (1) the
value zero for the points z_1, z_2, ..., z_p (independent of k), in which case all
series of the form (21) converge at those points, or we may choose (2)


px(zi) = (¢=1,2,-°:, 2; k=1,2,3,---), 
in which case no series (21) not convergent throughout the interior of C_{1+ε}
converges at the points z_i. It is because of this difference in behavior that
Theorem IX omits mention of the convergence or divergence of (21) outside
of C_{1+ε}.

9. Further properties of expansions. One interesting property of series 
(20) has not yet been mentioned, which brings out still more clearly the 
analogy with Taylor’s series: 


THEOREM IXa. The coefficients a_k in (20) can be written in the form

\[ a_k = A_0 F(0) + A_1 F'(0) + \cdots + A_k F^{(k)}(0), \]

where A_i is a constant independent of F(z), and where F^{(i)}(0) indicates the ith
derivative of F(z) at the origin.




Differentiation of (20), with insertion of the value z = 0, yields

\[ \begin{aligned}
F(0) &= a_0 p_0(0), \\
F'(0) &= a_1 p_1'(0), \\
F''(0) &= a_1 p_1''(0) + a_2 p_2''(0), \\
F'''(0) &= a_1 p_1'''(0) + a_2 p_2'''(0) + a_3 p_3'''(0), \\
&\;\;\vdots
\end{aligned} \]

Here we use the fact that p_0(z) is constant, and that p_k(z) has at least a
k-fold root at the origin and hence (I, p. 164), having precisely k roots
interior to C_{1+ε}, has precisely a k-fold root at the origin. This system of
equations is therefore such that p_k^{(k)}(0) is always different from zero,
k = 0, 1, 2, ..., and hence the system can be solved for the coefficients a_k
linearly in terms of the F^{(i)}(0).
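Because p_k has exactly a k-fold root at the origin, the system is triangular and can be solved by forward substitution. The sketch below works with Taylor coefficients F^{(n)}(0)/n! instead of the derivatives themselves, which is an equivalent normalization; the function name, the finite truncation, and the sample polynomials are illustrative assumptions rather than anything given in the paper.

```python
from fractions import Fraction


def coefficient(poly, n):
    """Coefficient of z**n in a polynomial stored as a list [c_0, c_1, ...]."""
    return poly[n] if n < len(poly) else Fraction(0)


def expansion_coefficients(taylor, polys):
    """Solve F = sum a_k p_k for a_0..a_N by forward substitution.

    taylor[n] is the coefficient of z**n in F; polys[k] is p_k, assumed to
    have exactly a k-fold root at the origin (so polys[k][k] != 0).
    """
    a = []
    for n in range(len(taylor)):
        known = sum(a[k] * coefficient(polys[k], n) for k in range(n))
        a.append((taylor[n] - known) / coefficient(polys[n], n))
    return a


# Illustrative polynomials: p_0 = 1, p_1 = z + z^2/2, p_2 = z^2 + z^3, p_3 = z^3.
polys = [
    [Fraction(1)],
    [Fraction(0), Fraction(1), Fraction(1, 2)],
    [Fraction(0), Fraction(0), Fraction(1), Fraction(1)],
    [Fraction(0), Fraction(0), Fraction(0), Fraction(1)],
]

# F = 2 p_0 - p_1 + 3 p_2 + 5 p_3 = 2 - z + (5/2) z^2 + 8 z^3.
taylor = [Fraction(2), Fraction(-1), Fraction(5, 2), Fraction(8)]
recovered = expansion_coefficients(taylor, polys)
```

Each a_n depends only on the first n + 1 Taylor coefficients, which mirrors the statement that a_k is a linear combination of F(0), F'(0), ..., F^{(k)}(0).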

As an application of Theorem IXa, it may be noticed that s_k(z) can be
written in the form

\[ s_0(z) = \frac{B_0^{(0)}}{z}, \qquad s_k(z) = \frac{B_0^{(k)}}{z} + \frac{B_1^{(k)}}{z^2} + \cdots + \frac{B_k^{(k)}}{z^{k+1}}, \quad k > 0; \]

this follows directly from the integral formula for the derivatives of F(z).
One may consider in some detail the expansion of an arbitrary function
Φ(z), analytic on and interior to C, in terms not of the polynomials p_k(z)
but in terms of their derivatives p_k′(z). Let F(z) be any integral of Φ(z),
so that we have for z on and interior to C

\[ F(z) = a_0 p_0(z) + a_1 p_1(z) + \cdots, \qquad a_k = \int_C F(z) s_k(z)\, dz, \]
\[ F'(z) = \Phi(z) = a_1 p_1'(z) + a_2 p_2'(z) + \cdots. \]

The term a_0 p_0′(z) is here omitted, for p_0(z) is a constant.
The integral

\[ \int_C s_k(z)\, dz, \qquad k > 0, \]

is equal to zero, for this integral may be written

\[ \frac{1}{p_0(z)} \int_C s_k(z) p_0(z)\, dz, \]

known to vanish by Theorem IX. Hence the indefinite integral σ_k(z) of s_k(z)
is single-valued in and on C.




Let us integrate

\[ a_k = \int_C F(z) s_k(z)\, dz, \qquad k > 0, \]

by parts, ∫u dv = uv − ∫v du, setting u = F(z), dv = s_k(z) dz. We find

\[ a_k = -\int_C \Phi(z) \sigma_k(z)\, dz. \]

As is to be expected, we find also by partial integration

\[ \int_C \sigma_k(z) p_i'(z)\, dz = -\int_C s_k(z) p_i(z)\, dz = -\delta_{ik}. \]

That is, there exists a set of functions {−σ_k(z)} such that the two sets
{p_k′(z)} and {−σ_k(z)} are biorthogonal. An arbitrary function Φ(z)
analytic on and within C can be expanded in the series

\[ \Phi(z) = a_1 p_1'(z) + a_2 p_2'(z) + \cdots, \qquad a_k = -\int_C \Phi(z) \sigma_k(z)\, dz, \]

which converges uniformly on and within C.

Both this remark on the derived functions and Theorem IXa can be
applied in the x-plane to the series (4) under the hypothesis of Theorem VIII.

10. Expansion of discontinuous functions. There are considered in I not
merely series such as (20), but likewise series in polynomials q_k(z) in the
reciprocal of z.* These series are used in I to expand arbitrary functions
satisfying a Lipschitz condition on C. In order to study the expansion of
discontinuous functions in such series, we investigate the function or functions
represented by Cauchy's Integral

\[ \frac{1}{2\pi i} \int_C \frac{f(t)\, dt}{t - z}, \]

where the given function f(t) is discontinuous. We shall suppose C to be
the same curve previously considered, although the discussion holds under
much broader conditions.

A particularly simple kind of discontinuity, that of a finite jump, is
typified by the function f(t) = log t, where we choose as that branch of the

* We notice that the argument used in I, p. 166, to prove b_0 = 0 can be somewhat shortened. We
choose, in fact, q_0(z) = 1, q_k(∞) = 0 for k > 0. Then in the expansion of f_2(z):

\[ f_2(z) = b_0 q_0(z) + b_1 q_1(z) + b_2 q_2(z) + \cdots, \]

it is obvious that b_0 = 0 when f_2(z) = 0 for z = ∞.




function log t the branch which is real and positive for the smallest real
positive value of t on C, say t_0. Consider in general the plane cut along the
line Ot_0, and from t_0 to infinity along a curve exterior to C. The determination
of the branch of log t considered is then made by the use of continuity in
the cut plane. The function f(t) is continuous on C except for a finite jump
at t_0 of magnitude 2πi.

We evaluate the integral

\[ F(z) = \frac{1}{2\pi i} \int_C \frac{\log t}{t - z}\, dt \]

by partial integration, ∫u dv = uv − ∫v du, setting u = log t, dv = dt/(t − z).
We find

\[ F(z) = \frac{1}{2\pi i} \Big[ \log t \, \log (t - z) \Big]_C - \frac{1}{2\pi i} \int_C \frac{\log (t - z)}{t}\, dt. \]

The first term in the right-hand member has the value 2πi + log t_0 + log
(t_0 − z), or log (t_0 − z), according as z lies interior or exterior to C. The
proper determination of log (t − z) is to be found by continuity, moving t
along C until it coincides with t_0, then by moving z, not crossing the cut,
until z coincides with the origin. The second term in the right-hand member
has a zero derivative with respect to z, if z lies interior to C, as is seen by
direct computation. The value of the integral, z interior to C, is therefore a
constant equal to the value for z = 0:

\[ -\frac{1}{2\pi i} \int_C \frac{\log t}{t}\, dt = -\frac{1}{4\pi i} \Big[ \log^2 t \Big]_C = -\pi i - \log t_0. \]

The value of this same integral

\[ -\frac{1}{2\pi i} \int_C \frac{\log (t - z)}{t}\, dt \]

when z lies exterior to C is, by Cauchy's Formula, −log (−z). We have
finally, therefore,

\[ f_1(z) = F(z) = \pi i + \log (t_0 - z), \quad z \text{ interior to } C, \]
\[ f_2(z) = -F(z) = -\pi i - \log (t_0 - z) + \log z, \quad z \text{ exterior to } C. \]
As a check we have f(z) = f_1(z) + f_2(z) when z lies on C (except in case z = t_0,
where the functions f_1(z) and f_2(z) are, strictly speaking, not defined), as
we should have by the results of Plemelj.* The function f_1(z) is analytic on

* See I, p. 167. The validity of the equation f(z) = f_1(z) + f_2(z) is dependent, provided f(z)
satisfies certain large conditions of integrability, merely on the behavior of the function f(z) in the
neighborhood of the point z considered; the satisfaction of a Lipschitz condition in such a
neighborhood is sufficient for the validity of the equation.




and interior to C except at the single point t_0; the function f_2(z) is analytic
on and exterior to C except at t_0 and vanishes at infinity; both functions
are integrable and have an integrable square on C. We notice too by direct
computation


Cc 


from which follow the formulas (notation of I) for the coefficients in the
expansion of f_1(t) and f_2(t):


f (b= 1,2,3, 
(22) 


We use here in proving (22) the fact that ¢,(¢) is analytic on and within C, 
hence on C can be expressed as a uniformly convergent series of polynomials 
in t; likewise s_k(t) is analytic on and exterior to C, hence on C can be expressed
as a uniformly convergent series of polynomials in 1/t each without constant
term. Such series may be integrated term by term, even after multiplication
by f_1(z) or f_2(z).

The development of f_1(z) on C in terms of the polynomials p_k(z) has
essentially the same convergence properties as the development of the
function

\[ \pi i + \log [t_0 - \psi(x)] \]

in a Fourier series (which is precisely the same as the development of the
function in a Taylor or Laurent series) on the unit circle γ in the x-plane.
The development of f_2(z) on C in the polynomials q_k(z) has essentially the
same convergence properties as the development of

\[ -\pi i - \log [t_0 - \psi_1(x)] + \log \psi_1(x) \]

in a Fourier (or Laurent) series on γ, where we may take the solutions x = x_0
of the two equations

\[ \psi(x) = t_0, \qquad \psi_1(x) = t_0 \]

equal,* ψ_1(x) being a mapping function for the exterior of γ onto the exterior
of C, with correspondence of the points at infinity. These two functions


* Rotation of axes does not alter the convergence properties of a Taylor or Fourier development. 




just considered in the x-plane are both integrable with an integrable square
and on γ possess continuous derivatives except at the point x_0. The
developments of the two functions converge therefore to the values of the respective
functions except at x_0, and uniformly except in the neighborhood of x_0.
In the neighborhood of the point x_0 the term-by-term sum of the two
developments converges like the Fourier development of

\[ \log \frac{t_0 - \psi(x)}{t_0 - \psi_1(x)} + \log \psi_1(x). \]

The latter term contains the only discontinuity, a finite jump of magnitude
2πi. Gibbs's phenomenon therefore occurs in its characteristic form at this
point x_0; the series converges to the value which is the arithmetic mean
of the limits approached in the two directions on γ at x_0 by the function
developed.

The Fourier development of f_2(t) transformed by either t = ψ(x) or
t = ψ_1(x) but interpreted for the same values of t has essentially the same
convergence properties in the two cases.

The discussion we have given is not essentially dependent on the
particular choice of t_0 made originally. We may therefore state


THEOREM XI. If the function F(z) satisfies a Lipschitz condition on C, and if
the function

\[ f(z) = F(z) + k_1 \log z + k_2 \log z + \cdots + k_n \log z, \]

where each term k_i log z is continuous on C except at a single point z_i of C,
z_i ≠ z_k (i ≠ k), k_i ≠ 0, be developed in a series (23) as in I (p. 156), then the
Fourier development of f(z) on the unit circle |x| = 1, where z = ψ(x), and the
series

\[ f(z) = a_0 p_0(z) + [a_1 p_1(z) + b_1 q_1(z)] + [a_2 p_2(z) + b_2 q_2(z)] + \cdots \tag{23} \]

have essentially the same convergence properties* on C. In particular (23)
exhibits Gibbs's phenomenon at the points z_i precisely as does a Fourier series.
On any closed arc of C containing no point z_i, the series

\[ a_0 p_0(z) + a_1 p_1(z) + a_2 p_2(z) + \cdots \tag{24} \]


* The statement that two series have the same convergence properties is used in two senses in the 
literature, to indicate (1) that their term-by-term difference converges uniformly to the sum zero, 
or (2) that their term-by-term difference converges absolutely and uniformly to the sum zero. 
The present writer has hitherto consistently used the second of these two meanings, but in the 
present case implies (1) instead of (2). The treatment given here considers uniform convergence 
but not absolute convergence. 




converges to the value f_1(z) and the series

\[ b_1 q_1(z) + b_2 q_2(z) + \cdots \tag{25} \]

converges uniformly to the value f_2(z), where

\[ f_1(z) = \frac{1}{2\pi i} \int_C \frac{f(t)\, dt}{t - z} \]

is analytic interior to C and continuous in the corresponding closed region
except at the points z_i, and

\[ f_2(z) = -\frac{1}{2\pi i} \int_C \frac{f(t)\, dt}{t - z} \]

is analytic exterior to C, vanishes at infinity, and is continuous in the
corresponding closed region except at the points z_i. For z = z_i the two series (24)
and (25) diverge with infinite sum, whereas the series (23) converges and its sum
is the arithmetic mean of the two limits approached by f(z) as z moves in opposite
senses on C and approaches z_i. If an arbitrary neighborhood of each of the
points z_i is cut out of the closed interior of C, the series (24) converges uniformly
to the value f_1(z) in the remaining closed region. If an arbitrary neighborhood
of each of the points z_i is cut out of the closed exterior of C, the series (25) converges
uniformly to the value f_2(z) in the remaining closed region.


Theorem XI is proved under the hypothesis on the ¢; that >> e¢* converges 


for every ?. 
Actual formulas for $f_1(z)$ and $f_2(z)$ in terms of logarithms and the functions represented by the integral

$$\frac{1}{2\pi i}\int_C \frac{F(t)\,dt}{t - z}$$

can easily be written down. Of course, any function which is smooth except for a finite number of finite jumps can be put into the form of f(z) of this theorem.

The conclusion of Theorem XI naturally holds for the formal Laurent development of a discontinuous function of the kind considered, if the curve C is a circle. In particular it is the divergence of the Taylor series for log (x−a) for x=a that enables us to conclude the divergence of (24) and (25) for $z = z_k$.
C. BOUNDARY VALUES OF AN ANALYTIC FUNCTION 


11. A condition for analyticity. We now take up the study of the 




boundary values of an analytic function, later for a multiply-connected region 
but first for a simply-connected region :* 


THEOREM XII. If the function f(z) is continuous on the analytic Jordan 
curve C, and if we have 


(26) $\int_C f(z)\,z^n\,dz = 0$ $(n = 0, 1, 2, \ldots)$,


then there exists a function F(z) analytic interior to C, continuous in the closed 
region which consists of C and its interior, and which coincides with f(z) on C. 


If the curve C is the unit circle |z| =1, the theorem is surely true. In fact 
the formal Laurent development of f(z) is of the form of a Taylor series, since 
by (26) the coefficients of the negative powers of z vanish: 

$$f(z) \sim a_0 + a_1 z + a_2 z^2 + \cdots, \qquad a_n = \frac{1}{2\pi i}\int_C \frac{f(t)\,dt}{t^{n+1}}.$$

This development is precisely the formal Fourier development of f(z) for $0 < \theta < 2\pi$ if we set $z = e^{i\theta}$. The Fourier development, when summed by the method of Cesàro, converges uniformly on C, since f(z) is continuous.
Each term of the corresponding sequence is analytic on and interior to C, 
hence the sum of the series is analytic interior to C, continuous in the cor- 
responding closed region, and is equal to f(z) on C. This completes the 
proof of the theorem when C is the unit circle. 
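
The unit-circle case can be checked numerically. The sketch below is only an illustration, not part of the proof: it assumes the test function f(z) = e^z (chosen here because it is the boundary value of an entire function), approximates the integrals in (26) by the trapezoid rule, and measures how closely the Cesàro means of the Taylor development reproduce f on the circle. The helper names `moment`, `taylor_coefficient`, and `cesaro_mean` are hypothetical.

```python
import cmath
import math

def moment(f, n, samples=512):
    """Trapezoid-rule value of the integral of f(z) z^n dz over |z| = 1."""
    total = 0j
    for j in range(samples):
        z = cmath.exp(2j * math.pi * j / samples)
        total += f(z) * z**n * 1j * z          # dz = i z d(theta) on the unit circle
    return total * (2 * math.pi / samples)

def taylor_coefficient(f, n, samples=512):
    """a_n = (1 / 2 pi i) * integral of f(t) t^(-n-1) dt over |t| = 1."""
    return moment(f, -(n + 1), samples) / (2j * math.pi)

def cesaro_mean(coeffs, z):
    """Arithmetic mean of the first len(coeffs) partial sums of sum a_k z^k."""
    N = len(coeffs)
    return sum((1 - k / N) * a * z**k for k, a in enumerate(coeffs))

f = cmath.exp   # illustrative assumption: boundary values of an entire function

# Conditions (26): the integrals of f(z) z^n dz should vanish for n = 0, 1, 2, ...
largest_moment = max(abs(moment(f, n)) for n in range(6))

# Cesaro means of the Taylor development, compared with f on the circle.
coeffs = [taylor_coefficient(f, n) for n in range(64)]
circle = [cmath.exp(2j * math.pi * j / 360) for j in range(360)]
worst_error = max(abs(cesaro_mean(coeffs, z) - f(z)) for z in circle)

print(largest_moment, worst_error)
```

The means use the weights 1 − k/N, which is the average of the first N partial sums written coefficient by coefficient. Averaging more terms should shrink the reported error.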

If C is not the unit circle, we map the interior of C onto the interior of the unit circle $\gamma$ in the x-plane, the transformation being as usual $x = \phi(z)$, $z = \psi(x)$. The function $[\phi(z)]^n \phi'(z)$ is analytic in and on C, hence on C can be (by Runge’s theorem) uniformly expanded in a series of polynomials in z:

$$[\phi(z)]^n \phi'(z) = \pi_0(z) + \pi_1(z) + \pi_2(z) + \cdots.$$

This series converges uniformly on C even after multiplication term by term by the continuous function f(z). Term-by-term integration of the new series thus formed yields, by virtue of (26),

$$\int_C f(z)\,[\phi(z)]^n \phi'(z)\,dz = 0.$$


* In connection with this problem and the conditions derived, see F. and M. Riesz, Comptes 
Rendus du Congrés (1916) des Mathématiciens Scandinaves, Uppsala, 1920, pp. 27-44; Privaloff, 
L’Intégrale de Cauchy, Saratow, 1919; Kakeya, Téhoku Mathematical Journal, vol. 5 (1914), pp. 
40-44, as well as the references given in I, p. 167. 




We have, then,

$$\int_\gamma f[\psi(x)]\,x^n\,dx = 0 \qquad (n = 0, 1, 2, \ldots),$$

so by the special case of the theorem already proved there exists a function analytic interior to $\gamma$, continuous in the corresponding closed region, and coinciding on $\gamma$ with the function $f[\psi(x)]$. Transformation by the formula $x = \phi(z)$ gives the required function in the z-plane.

Conditions (26), it may be remarked, are all independent of each other 
and none of them may be omitted. If all of those conditions except a finite 
number are satisfied, then there exists a function F(z) and a polynomial 
P(z) in 1/z such that F(z) is analytic interior to C, continuous in the cor- 
responding closed region, and such that 


$$F(z) + P(z) = f(z), \qquad z \text{ on } C.$$

The polynomial P(z) is uniquely determined if we require that it shall vanish at infinity; otherwise it is uniquely determined only to within an additive constant.
An alternate statement for Theorem XII is 


THEOREM XIIa. If the function f(z) is continuous on the analytic Jordan 
curve C, and if we have 


(27) $\int_C f(z)\,w(z)\,dz = 0$,


for every function w(z) analytic in the closed region* interior to C, then there 
exists a function F(z) analytic interior to C, continuous in the closed region 
which consists of C and its interior, and which coincides with f(z) on C. 


The equivalence of (26) and (27) is easy to show. If (27) holds, (26) is 
surely satisfied. If (26) holds, then an arbitrary function w(z) of the kind 
considered in Theorem XIIa can be uniformly expanded, in the closed 
region consisting of C and its interior, in a series of polynomials: 

$$w(z) = \pi_0(z) + \pi_1(z) + \pi_2(z) + \cdots.$$
Term-by-term multiplication of this series by f(z) and term-by-term inte- 
gration yield, by means of (26), equation (27). Thus (27) is both a necessary 
and a sufficient condition for the existence of F(z). 

There is a similar statement for functions representing the boundary 
values of a function analytic at infinity: 


* We have here an equivalent condition if w(z) is required to be analytic merely interior to C 
and continuous in the corresponding closed region. 




THEOREM XIII. Let the function f(z) be continuous on the analytic Jordan 
curve C, in whose interior the origin lies; then the two equivalent conditions 


(A) $\int_C f(z)\,z^k\,dz = 0$ $(k = -1, -2, -3, \ldots)$,

(B) $\int_C f(z)\,w(z)\,dz = 0$,


for every function w(z) analytic exterior to C (also at infinity), continuous in 
the corresponding closed region, and zero at infinity—these two conditions are 
each necessary and sufficient that there should exist a function F(z) zero at in- 
finity, analytic exterior to C (including the point at infinity), continuous in the 
corresponding closed region, and equal to f(z) on C. 


The proof of this theorem is easy and will be omitted. 

If in condition (A) we omit k= —1, and in (B) require that w(z) should 
have a double root at infinity, those two conditions remain equivalent. The 
conditions are then necessary and sufficient for the existence of F(z), analytic 
exterior to C (including the point at infinity), continuous in the corresponding 
closed region, and equal to f(z) on C. We cannot say, however, that F(z) 
vanishes at infinity. 

It will be noticed that if the continuous function f(z) satisfies (26) as 
well as (A) of Theorem XIII, the two functions defined interior and exterior 
to C respectively are analytic in the neighborhood of C, hence analytic also 
on C. The function analytic exterior to C vanishes at infinity, so f(z) is 
identically zero. 

12. Extension to multiply-connected regions. In extending Theorems 
XII and XIII to the case of regions bounded by several contours, we shall 
mention merely the analogue of condition (A) although the analogue of 
condition (B) is easily included. 


THEOREM XIV. If the analytic Jordan curve C’ lies interior to the analytic Jordan curve C, if the origin lies interior to C’, and if the functions $f_1(z)$ and $f_2(z)$ continuous on C and C’ respectively satisfy the conditions

(28) $\int_C f_1(z)\,z^k\,dz = \int_{C'} f_2(z)\,z^k\,dz$ $(k = \ldots, -2, -1, 0, 1, 2, \ldots)$,

then there exists a function F(z) analytic in the annular region bounded by C and C’, continuous in the corresponding closed region, and which on C and C’ coincides with $f_1(z)$ and $f_2(z)$ respectively.
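
That conditions of this kind are necessary can be seen numerically. The sketch below is only an illustration under stated assumptions: it takes C and C’ to be the circles |z| = 2 and |z| = 1, picks a hypothetical function F analytic in the closed annulus between them, and compares the integrals of F(z) z^k dz over the two circles. The names `contour_moment` and `F` are not from the text.

```python
import cmath
import math

def contour_moment(f, k, radius, samples=4096):
    """Trapezoid-rule value of the integral of f(z) z^k dz over |z| = radius,
    taken in the positive (counterclockwise) sense."""
    total = 0j
    for j in range(samples):
        z = radius * cmath.exp(2j * math.pi * j / samples)
        total += f(z) * z**k * 1j * z      # dz = i z d(theta)
    return total * (2 * math.pi / samples)

def F(z):
    # Hypothetical test function, analytic for 0 < |z| < 3,
    # hence in the closed annulus 1 <= |z| <= 2.
    return 1 / (z - 3) + 1 / z + z * z

# Largest difference between the integral over C (|z| = 2) and over C' (|z| = 1).
largest_gap = max(
    abs(contour_moment(F, k, 2.0) - contour_moment(F, k, 1.0))
    for k in range(-3, 4)
)
print(largest_gap)
```

The two integrals agree for every k although neither need vanish; for k = 0 both pick up the contribution of the term 1/z.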


It is sufficient to establish Theorem XIV in the case that C is a circle. For 
if the theorem is true in that case, we shall prove it to be true in the general 




case. Let $z = \psi(x)$, $x = \phi(z)$ denote as usual the functions which map the interior of C onto the interior of the unit circle $\gamma$ in the x-plane. Since $\phi(0) = 0$, the curve C’ corresponds to an analytic Jordan curve $\gamma'$ in whose interior the origin x=0 lies.

Conditions (28) lead to the equations 


(29) $\int_C f_1(z)\,[\phi(z)]^k \phi'(z)\,dz = \int_{C'} f_2(z)\,[\phi(z)]^k \phi'(z)\,dz.$


For the function $[\phi(z)]^k \phi'(z)$ is analytic in the closed region bounded by C and C’, hence in that closed region can be uniformly expanded in a series of polynomials in z and 1/z.* This expansion can be integrated term by term on C or C’, after multiplication through by $f_1(z)$ or $f_2(z)$. Computation of the two members of (29) by the use of this series makes their equality evident in the light of (28).

Equations (29) are precisely the equations

$$\int_\gamma f_1[\psi(x)]\,x^k\,dx = \int_{\gamma'} f_2[\psi(x)]\,x^k\,dx \qquad (k = \ldots, -2, -1, 0, 1, 2, \ldots),$$

sufficient for the existence of a function f(x) analytic in the annular region bounded by $\gamma$ and $\gamma'$, continuous in the corresponding closed region, and equal to $f_1[\psi(x)]$ and $f_2[\psi(x)]$ on $\gamma$ and $\gamma'$; this is sufficient for the existence of the function F(z) of the theorem.

It remains, then, to prove Theorem XIV when C is a circle. Consider the formal Taylor development of the function $f_1(z)$:

$$f_1(z) \sim a_0 + a_1 z + a_2 z^2 + \cdots, \qquad a_n = \frac{1}{2\pi i}\int_C \frac{f_1(z)\,dz}{z^{n+1}}.$$

This series converges interior to C, defining a function $F_1(z)$ analytic interior to C, and the series converges uniformly on any curve $\Gamma$ interior to C. Thus if $\Gamma$ is an arbitrary rectifiable Jordan curve interior to C and includes in its interior the origin, we have


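$$\int_\Gamma \frac{F_1(z)\,dz}{z^k} = \int_C \frac{f_1(z)\,dz}{z^k} = \int_{C'} \frac{f_2(z)\,dz}{z^k} \qquad (k = 1, 2, 3, \ldots).$$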


Hence the function $f_2(z) - F_1(z)$ is continuous on C’ and satisfies the conditions 


* This is very easy to prove by writing the function involved as the sum of two functions, given 
by Cauchy’s integral taken over C and C’ respectively. The one function is analytic on and interior 
to C, the other on and exterior to C’. 




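$$\int_{C'} \frac{[f_2(z) - F_1(z)]\,dz}{z^k} = \int_{C'} \frac{f_2(z)\,dz}{z^k} - \int_{C'} \frac{F_1(z)\,dz}{z^k} = 0 \qquad (k = 1, 2, 3, \ldots).$$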


By Theorem XIII there exists a function $F_2(z)$ which is analytic exterior to C’, continuous in the corresponding closed region, and which on C’ coincides with $f_2(z) - F_1(z)$. The function $F_2(z) - f_1(z)$ is continuous on C. From the relations

$$\int_C F_2(z)\,z^k\,dz = \int_{C'} f_2(z)\,z^k\,dz = \int_C f_1(z)\,z^k\,dz \qquad (k = 0, 1, 2, \ldots),$$

we deduce by Theorem XII the existence of a function $\Phi(z)$ analytic interior to C, continuous in the corresponding closed region, and coinciding on C with $F_2(z) - f_1(z)$:

(30) $\Phi(z) = F_2(z) - f_1(z)$, z on C.


We have, however,

$$\int_C f_1(z)\,z^k\,dz = \int_C [F_2(z) - \Phi(z)]\,z^k\,dz = -\int_C \Phi(z)\,z^k\,dz \qquad (k = -1, -2, -3, \ldots),$$

so that the two functions $-\Phi(z)$ and $F_1(z)$ have the same coefficients in their Taylor development about the origin and are therefore identical. The function $F_2(z) - \Phi(z)$ is then analytic interior to the annular region bounded by C and C’, continuous in the corresponding closed region, by (30) equals $f_1(z)$ on C, and by the definition of $F_2(z)$ equals $f_2(z)$ on C’.

Theorem XIV, whose proof is now complete, can be extended to regions of higher connectivity:


THEOREM XV. Let R be the region bounded by an analytic Jordan curve $C_0$ and by non-intersecting analytic Jordan curves $C_1, C_2, \ldots, C_m$ lying interior to $C_0$. If the function f(z) is continuous on C, the complete boundary of R, then a necessary and sufficient condition that there exist a function F(z) analytic in R, continuous in the corresponding closed region, and equal to f(z) on C, is

(31) $$\int_C f(z)\,z^k\,dz = 0 \quad (k = 0, 1, 2, \ldots), \qquad \int_C \frac{f(z)\,dz}{(z - z_i)^k} = 0 \quad (i = 1, 2, \ldots, m;\ k = 1, 2, 3, \ldots),$$

where $z_i$ is an arbitrary fixed point interior to $C_i$. The integrals in (31) are to be taken over the complete boundary of R, in the positive sense on that boundary.


The proof of Theorem XV is similar to the proof of Theorem XIV 
and is omitted. Theorem XV remains true if the Jordan curves $C_i$ are no longer required to be analytic, provided they are regular in the sense of Osgood,* and provided the function f(z) satisfies a Lipschitz condition on C. The proof of this new theorem is likewise fairly simple, as an application of the theorem of Plemelj used in I, p. 167. It will be noticed too that in the proofs of the theorems given we need not require that the curves used be analytic; it is sufficient if the derivative $\phi'(z)$ of the mapping function used in each case is continuous in the closed region which we map.

The existence of other theorems which lie not far away is obvious; we 
give a single example related to Theorem XIV: 


THEOREM XVI. Let the analytic Jordan curve C’ lie interior to the analytic Jordan curve C, let the origin lie interior to C’, and let the functions $f_1(z)$ and $f_2(z)$ be defined and continuous on C and C’ respectively. Then a necessary and sufficient condition for the existence of a function F(z) analytic interior to C, continuous in the corresponding closed region, and coinciding on C and C’ with $f_1(z)$ and $f_2(z)$ respectively, is

$$\int_C f_1(z)\,z^k\,dz = \int_{C'} f_2(z)\,z^k\,dz = 0 \qquad (k = 0, 1, 2, \ldots),$$

$$\int_C f_1(z)\,z^k\,dz = \int_{C'} f_2(z)\,z^k\,dz \qquad (k = -1, -2, -3, \ldots).$$


Theorems XII-XVI have obvious application to expansions in terms 
of polynomials, particularly in connection with such theorems as IX, which 
do not demand analyticity for the development of a given function. 


* Funktionentheorie, Leipzig, 1912, p. 51. 


HARVARD UNIVERSITY, 
CAMBRIDGE, MAss. 




PRIMITIVE GROUPS WHICH CONTAIN SUBSTITUTIONS 
OF PRIME ORDER p AND OF DEGREE 6p OR 7p* 


BY 
MARIE J. WEISS 


1. In a memoir on primitive groups in the first volume of the Bulletin of the Mathematical Society of France† Jordan announced the following theorem:

Let q be a positive integer <6, p any prime >q. The degree of a primitive group G that contains a substitution of order p on q cycles (without including the alternating group) cannot exceed pq+q+1.


He gave proofs for the cases q=1, and q=2, but no proofs for greater values of q. In the same memoir, he also found a limit for the degree of G when q is not restricted to numbers <6. His results may be stated as follows:


Let q be any positive integer, p any prime>2qlogeqg+qt+1. The degree 
of a primitive group G that contains a substitution of order p on q cycles (without 
including the alternating group) cannot exceed gp+2q log: 2g. 


Manning has studied this problem further. He not only gave proofs for the cases q=2, 3, 4, 5, finding a somewhat closer limit than the one announced by Jordan, but also found a much lower limit for the degree of G in the general case. Using the theory developed in the proof of his general theorem, he investigated the case q=6. A brief statement of these theorems follows:

Let q be any integer greater than unity and <5, p any prime >q+1. Then the degree of a primitive group of class>3 which contains a substitution of order p and of degree qp cannot exceed qp+q. When p=q+1, the degree cannot exceed qp+q+1.

If a primitive group of class>3 contains a substitution of prime order p on 5 cycles, p>5, its degree cannot exceed 5p+6. Moreover, if a primitive group of degree 5p+6 exists, it is doubly transitive.

The degree of a primitive group G of class>3 which contains a substitution of prime order p and of degree qp (p>2q−3, q>1), does not exceed qp+4q−4. Moreover $p^2$ does not divide the order of G.

* Presented to the Society, San Francisco Section, October 29, 1927; received by the editors September 1, 1927.
† C. Jordan, Bulletin de la Société Mathématique de France, vol. 1 (1873), pp. 175-221.


334 M. J. WEISS [April 


The degree of a primitive group of class>3 which contains a substitution of prime order p and of degree 6p (p>6) does not exceed 6p+10.


He published these results in a series of 4 papers, entitled On the order of 
primitive groups.* This title draws attention to the fact that the problem of 
finding a limit for the degree of these primitive groups is equivalent to the 
problem of finding a limit for the order of a primitive group in terms of its 
degree. The fact that the order of these groups is limited by their degree is 
discussed by Manning in his second paper. 

2. In the present paper, the case q=7 will be investigated and the limit for the degree of G for the case q=6 will be lowered. The following theorems will be proved:


The degree of a primitive group G of class>3 which contains a substitution 
of prime order p(p>7) and of degree 6p cannot exceed 6p+6. If p=7, the 
true limit for the degree of G is 6p+7. Moreover, if G of degree 6p+7 exists, 
it is doubly transitive. 

The degree of a primitive group G of class>3 which contains a substitution 
of prime order p and of degree 7p, p>7, does not exceed 7p+8. Moreover, if 
G of degree 7p+8 exists, it is doubly transitive. 
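
For reference, the limits quoted so far can be gathered in one small function. This is only a bookkeeping sketch: the name `degree_limit` is hypothetical, the argument p is assumed to be prime without being checked, and each branch transcribes one of the statements above for primitive groups of class greater than 3, using the limits 6p+6 and 7p+8 of the present paper in place of Manning's 6p+10.

```python
def degree_limit(q, p):
    """Limit quoted in the text for the degree of a primitive group of class > 3
    that contains a substitution of prime order p and of degree q*p.
    Assumes (without checking) that p is prime."""
    if 2 <= q <= 4 and p >= q + 1:
        # Manning: qp+q when p > q+1, and qp+q+1 when p = q+1.
        return q * p + q + (1 if p == q + 1 else 0)
    if q == 5 and p > 5:
        return 5 * p + 6
    if q == 6 and p >= 7:
        # First theorem of this paper: 6p+6 for p > 7, 6p+7 for p = 7.
        return 6 * p + (7 if p == 7 else 6)
    if q == 7 and p > 7:
        # Second theorem of this paper.
        return 7 * p + 8
    if q > 1 and p > 2 * q - 3:
        # Manning's general limit.
        return q * p + 4 * q - 4
    raise ValueError("no limit quoted in the text covers this pair (q, p)")

for q, p in [(4, 5), (5, 7), (6, 7), (6, 11), (7, 11), (8, 17)]:
    print(q, p, degree_limit(q, p))
```

Pairs not covered by any quoted statement, such as q=7 with p=7, raise an error rather than return a guess.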


Although the first theorem is not proved until §17, we shall use it in the 
proof of the second theorem, for the former depends in no way upon the latter. 
The proof of the latter theorem is based on the general method developed 
by Manning in his third paper On the order of primitive groups, of which es- 
pecially §§9-20 should be carefully read before reading the following 
proof. All definitions and the fundamental theory will be found in these 
sections. The assumption (III, §12) that the degree of $H_{r+1}$ exceeds qp+q should be noted, for with the exception of §15 this hypothesis is held throughout the present paper. We now proceed to the proof of this theorem.

3. If $H_{r+1}$ is imprimitive, $H_{r+s}$ (III, §18) has systems of imprimitivity of 7 letters only, for the number of letters in a system must divide 7 (III, §17). Moreover, if $H_{r+1}$ is of degree>qp+q, the systems of imprimitivity of $H_{r+s}$ are permuted according to a primitive group which is not triply transitive (III, §18), thus according to a primitive group of degree p+1 at most. Then the degree of $H_{r+s}$ does not exceed 7p+7. Now since a generator introduces 7 letters or none, s=1. We shall now consider $H_{r+1}$ of degree 7p+7. Clearly the order of $H_{r+1}$ is not divisible by $p^2$ when p>7. Then $J_1$


* W. A. Manning, these Transactions, vol. 10 (1909), p. 247; vol. 16 (1915), p. 139; vol. 19 
(1918), p. 127; vol. 20 (1919), p. 66. These 4 papers will be referred to by the Roman numerals I, 
II, III, IV, respectively. 


1928] PRIMITIVE GROUPS 335 


is transitive of degree 7. Now $H_{r+1}$ may be contained in a doubly transitive group of degree 7p+8, for if $H_{r+2}$ exists, $J_2$ is multiply transitive of degree 8. Since the J group is always a transitive representation of a subgroup or quotient group of the direct product of a cyclic group of degree a divisor of p−1 and the symmetric group of degree 7 (see §5), $J_2$ must be the simple group of order 168, for it is the only primitive group of degree 8 which does not occur for the first time on 8 letters. A triply transitive group of degree 7p+9 does not exist, for then $J_3$ is of degree 9. However, $J_2$ is not contained in any non-alternating primitive group of higher degree, and J is not alternating when its degree exceeds 7 (III, §20).

4. Let $H_{r+1}$ be a primitive group. Its subgroup F is intransitive. If p>2q−3, no constituent of F is alternating, nor does an imprimitive constituent permute its systems according to an alternating group (III, §35).*

We shall now assume p>2q−3. The case of p=2q−3 will be taken up in §16. Then the degree of F does not exceed 7p+14. The order of F is not divisible by $p^2$ (III, §27), nor is the order of the subgroup L of $H_{r+1}$ that leaves one letter fixed.

5. The group J, that occurs in H,,; has no substitutions on the letters 
of J; only, for then G would be of class <2¢—4 (III, §21) and therefore of 
degree 

We shall need the following theorems (I, Theorems 5 and 6, p. 251) in 
discussing the J groups. 

Let P be a cyclic group of prime order p and of degree qp(q<p). The largest 
group G on the same letters that transforms P into itself and that contains no 
substitution of order p with <q cycles is of order p(p—1)(q!). 

The quotient group G/P is the direct product of a cyclic group of order p—1 
and a group isomorphic to the symmetric group of degree q. 

Then the constituent of $J_1$ on the letters of $A_1$ is the group of order p(p−1)(q!) or a subgroup of it. Thus $J_1$ is a transitive representation of the direct product of a cyclic group of order d (d a divisor of p−1) and a group 


* An accurate statement of this theorem follows: If p>2q—3, the degree of H,,; does not exceed 
qgp+q, when H;; has a transitive constituent which is alternating or which permutes systems of 
imprimitivity according to an alternating group. If =2q—3, there is one exception to this limit of 
the degree of H,4:. It occurs when E, (III, § 30) has a transitive constituent simply isomorphic 
to the alternating constituent of degree p. Then E, has three constituents of degrees (p—1)/2, 9, p. 
The third constituent cannot be of degree +, for such a constituent is of too small a degree to be 
simply isomorphic to the alternating constituent of degree p and (III, §33) EZ; cannot have a primitive 
constituent multiply isomorphic to the alternating constituent of degree p. Then H,,; is of degree 

† W. A. Manning, American Journal of Mathematics, vol. 28 (1906), p. 226. 




K which occurs as a quotient group among the groups of degree<7. Let the group K of order kk’ be multiplied into a cyclic group of order d. The direct product of order kk’d can be represented as a transitive group on dk letters if and only if the group K of order kk’ has a subgroup K’ of order k’ which contains no invariant subgroup of K. Call the subgroup of $J_1$ that leaves one letter fixed $J_1'$. It should be noted that $J_1'$ is not the identity if $H_{r+1}$ is of degree>qp+q. If the subgroup K’ is invariant in a group of order mk’, $J_1'$, isomorphic to K’, is invariant in a group of order mk’ and therefore fixes exactly m letters.

6. The following theorems on simply transitive primitive groups by 
Manning will be used repeatedly: 


THEOREM 1. Let $G_1$, the subgroup that fixes one letter of a simply transitive primitive group G of degree n and order g, have a multiply transitive constituent of degree m. If $G_1$ has no transitive constituent whose degree (>m) is a divisor of m(m−1), all the transitive constituents of $G_1$ are simply isomorphic multiply transitive groups of degree m and order g/n.*


THEOREM 2. If only one transitive constituent of $G_1$ is an imprimitive group (of order f), $G_1$ is of order f.


THEOREM 3. If $G_1$ has an intransitive constituent of order f, and if all the transitive constituents on the remaining letters of $G_1$ are primitive groups, $G_1$ 


is of order 


THEOREM 4. Let $G_1$ have a primitive constituent M of degree m, in which the subgroup $M_1$ that fixes one letter is primitive. Let M be paired with itself in $G_1$ and let the order of M be <g/n. Then $G_1$ contains an imprimitive constituent in which there is an invariant intransitive subgroup with m transitive constituents of m−1 letters each, permuted according to the permutations of the primitive group M.‡


7. We wish to see how far the reasoning used in the proof of Theorem 1 may be applied to the subgroup F of $H_{r+1}$ when $H_{r+1}$ is a primitive group of degree >qp+q (q>5, p>2q−3). Let L(x) (=L, III, §21) be the subgroup of $H_{r+1}$ that fixes the letter x. Then L(x) has an invariant subgroup F(x)=F generated by all of its substitutions of order p. Let F(x) have a transitive constituent of degree p+2 on the letters $a_1, a_2, \ldots, a_{p+2}$. If the order of F(x) is t, the order of $F(x)(a_1)$ is t/(p+2) and the order of $F(x)(a_1)(a_2)$ 


* W. A. Manning, Primitive Groups, 1921, p. 83. 
† W. A. Manning, Proceedings of the National Academy of Sciences, vol. 12 (1926), p. 755. 
‡ W. A. Manning, these Transactions, vol. 29 (1927), pp. 815-825. 




is t/[(p+2)(p+1)]. In $F(a_1)$, x belongs to a transitive constituent of degree p+2. Since $F(a_1)(x)$ has a transitive constituent on the letters $a_2, a_3, \ldots, a_{p+2}$, the transitive constituent to which x belongs in $F(a_1)$ contains either p+1 a’s or none. Suppose that $F(a_1)$ has a transitive constituent on the letters $x, a_2, a_3, \ldots, a_{p+2}$. Then the group $\{F(a_1), F(x)\}$ has a transitive constituent of degree p+3 on the letters $x, a_1, \ldots, a_{p+2}$. Now a group of degree p+3 that contains a substitution of degree and order p is alternating. The subgroup of $\{F(a_1), F(x)\}$ that fixes the letter x contains an invariant subgroup, F(x), generated by all of its substitutions of order p and has an alternating constituent on the letters $a_1, a_2, \ldots, a_{p+2}$. Since an alternating group of degree >4 is simple, F(x) has an alternating constituent on the letters $a_1, a_2, \ldots, a_{p+2}$. However, if p>2q−3, F has no alternating constituents. If $F(a_1)$ has a transitive constituent on the letters $a_2, a_3, \ldots, a_{p+2}$, F(x) also has a transitive constituent of degree p+1. But it can be shown that F cannot have constituents of degree p+2 and p+1 at the same time. The constituent groups of F are positive groups. In order that the constituent of degree p+2 contain no negative substitutions, the substitution from its J group that is associated* with the substitution from its J group must be negative. This negative substitution is from the metacyclic group and consequently it is negative in the J group of the constituent of degree p+1. Substitutions from the metacyclic group of $J_1$ have cycles on letters of each cycle of $A_1$. Then the letters $a_2, a_3, \ldots, a_{p+2}$ belong in $F(a_1)$ to a transitive constituent of degree $\mu \ge p+2$. Thus the order of $F(a_1)(a_2)$ is $t/\mu$, and if x belongs to a transitive constituent of degree $\delta$ in $F(a_1)(a_2)$, the order of $F(a_1)(a_2)(x)$ is $t/(\mu\delta)$. Then $t/[(p+2)(p+1)] = t/(\mu\delta)$. If F(x) contains no constituent whose degree (>p+2) divides (p+2)(p+1), 

The next difficulty in applying the proof of Theorem 1 arises when we consider the constituents of degree p+2 which contain p+1 a’s in the groups $F(a_1), F(a_2), \ldots, F(a_{p+2})$. However, if $c_1 = c_2$, the group $\{F(a_1), F(a_2)\}$ contains a transitive constituent on the letters $c_1, a_1, a_2, \ldots, a_{p+2}$. Thus as above, $F(a_1)$ has an alternating constituent on the letters $c_1, a_2, a_3, \ldots, a_{p+2}$.


* An intransitive group may be regarded as formed from its transitive constituents by establish- 
ing an isomorphism between one transitive constituent and the constituent (transitive or intransitive) 
on the remaining letters, and then multiplying corresponding substitutions. Thus any substitution 
of an intransitive group is the product of substitutions from all transitive constituents (taking the 
identity into account). These substitutions from different transitive constituents which occur as 
factors in a given substitution are said to be associated. For example, in the intransitive octic group 
written out in §13, the substitution (47) is said to be associated with the substitution (58) (69). 




We may now follow the proof of Theorem 1 until the statement “if B 
and C coincide.” If B and C coincide, the group $\{F(a_1), F(x)\}$ has a transitive 
constituent of degree 2p+5. Such a constituent is alternating. Now the 
proof (III, §28) that no alternating constituent of H;; involves letters of 
more than one cycle of $A_1$ without causing the presence of a substitution of order p and of degree <qp in G applies to any intransitive group generated by substitutions of order p and of degree qp, thus also to the group $\{F(a_1), F(x)\}$. Since transitive groups generated by substitutions of order p and of degrees 3p+7, 4p+9, 5p+11, 6p+13 are alternating, we see that if F(x) has one transitive constituent of degree p+2 and no transitive constituent whose degree (>p+2) divides (p+2)(p+1), it has at least six constituents of degree p+2.

The results of the above discussion may be summarized in 


THEOREM 5. Let $H_{r+1}$ be a primitive group of degree>qp+q (p>2q−3, q>5). If F has a transitive constituent of degree p+2 and no transitive constituent whose degree (>p+2) divides (p+2)(p+1), it has at least six transitive constituents of degree p+2.


The following theorem will also be useful. 


THEOREM 6. Let G be a simply transitive primitive group. If the subgroup 
that fixes one letter of G has a transitive constituent of degree m, it must also 


have another transitive constituent whose degree divides $mk_i$, where $k_i$ ($\ge 1$) is the degree of a transitive constituent of the subgroup that fixes one letter of the constituent of degree m.


Let G(x) be the subgroup of G that fixes the letter x. Let G(x) have a transitive constituent of degree m on the letters $a_1, a_2, \ldots, a_m$. Note that the theorem is proved if G(x) has a second transitive constituent of degree m. It may thus be assumed that G(x) has only one transitive constituent of degree m. Now let $G(x)(a_1)$ have transitive constituents of degrees $k_1, k_2, \ldots, k_v$ on the letters $a_2, a_3, \ldots, a_m$. The order of G(x) is g/n, if n is the degree of G. Then the order of $G(x)(a_1)$ is g/(nm), and the order of $G(x)(a_1)(a_2)$ is $g/(nmk_1)$, $g/(nmk_2)$, ..., or $g/(nmk_v)$, according as $a_2$ belongs to a transitive constituent of degree $k_1, k_2, \ldots$, or $k_v$ in $G(x)(a_1)$. Now x belongs to a transitive constituent of degree m in $G(a_1)$. We know that at least one of the transitive constituents of degree $k_i$, $i = 1, 2, \ldots, v$, on the letters $a_2, a_3, \ldots, a_m$ in $G(x)(a_1)$ belongs in $G(a_1)$ to a transitive constituent of degree $r > k_i$, which does not include the letter x.* Then the order of 


* W. A. Manning, these Transactions, vol. 29 (1927), p. 815. 




$G(a_1)(a_2)$ is g/(nr), if $a_2$ is the letter that occurs in the transitive constituent of degree $k_i$ in $G(x)(a_1)$, and if x belongs to a transitive constituent of degree s in $G(a_1)(a_2)$, the order of $G(a_1)(a_2)(x)$ is g/(nrs). Then at least one of the following equations is true:

$$r = m k_i / s \qquad (i = 1, 2, \ldots, v).$$

If $k_i = 1$, a condition that may arise when the constituent of degree m is imprimitive, we conclude that r is a divisor of m.
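
The counting in this proof can be followed on a small example. The sketch below is an illustration chosen here, not taken from the paper: it assumes the symmetric group of degree 5 acting on the 10 pairs of letters, which is a simply transitive primitive group, and computes by brute force the transitive constituents of G(x) and the degrees $k_i$ for each of them. The helper names are hypothetical.

```python
from itertools import combinations, permutations

# The 10 letters: the 2-subsets of {0, 1, 2, 3, 4}.
LETTERS = list(combinations(range(5), 2))

def act(perm, pair):
    """Image of a 2-subset under a permutation of {0, ..., 4}."""
    return tuple(sorted(perm[i] for i in pair))

def stabilizer(group, letter):
    """Elements of `group` (a list of permutations) fixing `letter`."""
    return [g for g in group if act(g, letter) == letter]

def orbits(group, letters):
    """Orbits of `group` on `letters`, a set of letters the group maps to itself."""
    remaining, found = set(letters), []
    while remaining:
        start = min(remaining)
        orbit = {act(g, start) for g in group}
        found.append(sorted(orbit))
        remaining -= orbit
    return found

G = list(permutations(range(5)))          # order g = 120, degree n = 10
x = (0, 1)
Gx = stabilizer(G, x)                     # order g/n = 12
constituents = orbits(Gx, [l for l in LETTERS if l != x])

def check_theorem_6(constituents, Gx):
    """For each constituent of degree m, list the degrees k_i and test whether
    some other constituent has a degree dividing m * k_i."""
    report = []
    for block in constituents:
        m, a1 = len(block), block[0]
        rest = [l for l in block if l != a1]
        ks = [len(o) for o in orbits(stabilizer(Gx, a1), rest)]
        others = [len(b) for b in constituents if b is not block]
        ok = any((m * k) % d == 0 for d in others for k in ks)
        report.append((m, ks, others, ok))
    return report

for row in check_theorem_6(constituents, Gx):
    print(row)
```

Here G(x) has two transitive constituents, of degrees 6 and 3, and each satisfies the divisibility asserted by Theorem 6.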

8. We are now prepared to study the subgroup F of a primitive $H_{r+1}$. Let 
F be of degree 7p+14 (§4). Since F includes H;, it has at most 5 constituents. 
The partitions of the degree of F are the following: 


6p+12, p+2
5p+10, 2p+4
4p+8, 3p+6
5p+10, p+2, p+2
4p+8, 2p+4, p+2
3p+6, 3p+6, p+2
3p+6, 2p+4, 2p+4
4p+8, p+2, p+2, p+2
3p+6, 2p+4, p+2, p+2
2p+4, 2p+4, 2p+4, p+2
3p+6, p+2, p+2, p+2, p+2
2p+4, 2p+4, p+2, p+2, p+2.
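
Each constituent in this list has degree kp+2k, so the list corresponds to the ways of writing 7 as a sum of at least two and at most five positive integers k. The sketch below enumerates those sums as a cross-check on the count; the function names are hypothetical and the code is only an illustration.

```python
def partitions(total, max_parts, largest=None):
    """Yield the partitions of `total` into at most `max_parts` positive parts,
    each partition written in non-increasing order."""
    if largest is None:
        largest = total
    if total == 0:
        yield ()
        return
    if max_parts == 0:
        return
    for first in range(min(total, largest), 0, -1):
        for rest in partitions(total - first, max_parts - 1, first):
            yield (first,) + rest

def constituent_degrees(parts):
    """Write a partition (k1, k2, ...) as the degrees k*p + 2*k of the constituents."""
    return [f"{k}p+{2 * k}" if k > 1 else "p+2" for k in parts]

# F is intransitive, so at least two parts; at most five constituents.
rows = [parts for parts in partitions(7, 5) if len(parts) >= 2]
for parts in rows:
    print(", ".join(constituent_degrees(parts)))
```

The enumeration yields twelve partitions, in agreement with the twelve rows listed.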


The following partitions are impossible: 6p+12, p+2; 4p+8, 2p+4, p+2; 3p+6, 3p+6, p+2; 2p+4, 2p+4, 2p+4, p+2, for in each case L has a transitive constituent of degree p+2, paired with itself, whose order is less than that of L. By Theorems 2 and 3, the imprimitive constituents of L determine the order of L. The multiple isomorphism between the constituent of degree p+2 and the constituent whose order determines that of L follows from the multiple isomorphism between the J groups of these constituents. Then all the conditions of Theorem 4 are satisfied and consequently L should have a transitive constituent of degree (p+2)(p+1).

Now $J_1$ is a transitive group of degree 15. Then dk=15, and d=1, 3, or 5. When d=3 or 5, $J_1'$ fixes 3 and 5 letters respectively. However, the partitions of the degree of F show that $J_1'$ fixes one letter only and therefore d=1 and k=15. Now the partitions also show that k’ is even. No orders will be listed when there are no groups or quotient groups of these orders on <8 letters. Then the possible values of 15k’ are 60, 120, 240, 360, 720, 2520, 5040. The order 5040 is impossible, for the symmetric group of degree 7 contains no subgroup of order 336.

If 15k’=60, $J_1'$ is of order 4. Since the J group of a constituent of degree 2p+4 is octic (IV, §3), the partitions of the degree of F exclude this representation.

Let 15k’=120. $J_1'$ is of order 8. Since the least order of the J group of a constituent of degree 4p+8 is 16 (IV, §3), we need consider for this representation only the following partition of the degree of F: 2p+4, 2p+4, p+2, p+2, p+2. From the groups of order 120 on <8 letters, only one distinct representation is obtained. This representation is given by the symmetric group of degree 5 with respect to its octic subgroup. $J_1'$ has three transitive constituents of degrees 2, 4, and 8. It is generated by 


{23-4567 -Syux-9zv0, 23-46-82-ou-xv-9y}. 


Then in L this partition of the degree of F becomes 4p+8, 2p+4, p+2. Thus the constituent of degree p+2 in L satisfies all the conditions of Theorem 4, but L has no transitive constituent of degree (p+2)(p+1).

If 15k’=240, $J_1'$ is of order 16. However the only group of order 240 on <8 letters contains no non-invariant subgroup of order 16.

If 15k’=360, $J_1'$ is of order 24. There is only one representation of the alternating group of degree 6 on 15 letters and $J_1'$ has two transitive constituents one of degree 6 and one of degree 8. The group $J_1'$ is generated by 


{25-34-68-79-xy-zu, 28-36-45-79-ox-vu, 29 -37-45-68-0z- yv} 


Now consider the possible partitions of the degree of F for this J_1'. A
constituent of degree 3p+6 is impossible, for its J group is of order 18, 36, or
72 (IV, §3). A constituent of degree 4p+8 is likewise impossible, for its
J group is of order 16, 32, 128 or greater. Then there is only the partition
2p+4, 2p+4, p+2, p+2, p+2 to be considered. It calls for an invariant
intransitive subgroup of degree and order 8 in J_1'. However all the subgroups
of order 8 are conjugate in J_1'.

If 15k'=720, J_1' is of order 48. There is one representation of the
symmetric group of degree 6 with respect to its subgroup {abc, ad, ef}. J_1' has
two constituents of degrees 8 and 6, respectively, in a two-to-one
isomorphism. It is generated by


{246·573·ozv·xuy, 28·39·xz·yv, 23·45·67·89}.


Now the J group of a constituent of degree 3p+6 in F is incompatible with
a J_1' of order 48. Then the only possible partitions are 4p+8, p+2, p+2,


1928] PRIMITIVE GROUPS 341 


p+2, and 2p+4, 2p+4, p+2, p+2, p+2. Neither partition, however,
allows a two-to-one isomorphism between the constituents of J_1'.

If 15k'=2520, there is one representation of the alternating group of
degree 7 with respect to its subgroup of order 168. Now this subgroup
contains substitutions of order 7 and consequently J_1' contains substitutions
of the same order. Then J_1' has two constituents of degree 7 or it is
transitive of degree 14. The partitions of the degree of F show that either case is
impossible. Hence H_{p+1} is not of degree 7p+15.

9. Let F be of degree 7p+13. The partitions of the degree of F are the
following:

6p+12, p+1
5p+10, p+2, p+1
4p+8, 2p+4, p+1
3p+6, 3p+6, p+1
4p+8, p+2, p+2, p+1
3p+6, 2p+4, p+2, p+1
2p+4, 2p+4, 2p+4, p+1
3p+6, p+2, p+2, p+2, p+1
2p+4, 2p+4, p+2, p+2, p+1
2p+4, p+2, p+2, p+2, p+2, p+1.

All these partitions have one and only one constituent of degree p+1. Then
by Theorem 1, L should have a transitive constituent whose degree divides
(p+1)p. Thus H_{p+1} is not of degree 7p+14.

10. H_{p+1} cannot be of degree 7p+13. If p>13, J_1 is transitive of degree
13, but no subgroup of the direct product of a cyclic group of order a divisor
of p-1 and the symmetric group of degree 7 can be written as a group of
degree 13. Then p=13. However, a primitive group of degree qp+p does
not exist unless p<2q-2 (III, §22).

Similarly if H_{p+1} is of degree 7p+11, p=11, but by hypothesis p>11.

11. Let F be of degree 7p+11. The partitions of the degree of F are the 
following: 

5p+10, 2p+1
4p+8, 3p+3
*5p+10, p+1, p
*4p+8, 2p+2, p+1
†4p+8, 2p+1, p+2
†3p+6, 3p+3, p+2
3p+6, 2p+4, 2p+1
3p+3, 2p+4, 2p+4
*4p+8, p+2, p+1, p
4p+8, p+1, p+1, p+1
*3p+6, 2p+4, p+1, p
*3p+6, 2p+2, p+2, p+1
3p+6, 2p+1, p+2, p+2
3p+3, 2p+4, p+2, p+2
*2p+4, 2p+4, 2p+2, p+1
†2p+4, 2p+4, 2p+1, p+2
*3p+6, p+2, p+2, p+1, p
†3p+6, p+2, p+1, p+1, p+1
3p+3, p+2, p+2, p+2, p+2
*2p+4, 2p+4, p+2, p+1, p
2p+4, 2p+4, p+1, p+1, p+1
*2p+4, 2p+2, p+2, p+2, p+1
2p+4, 2p+1, p+2, p+2, p+2
*2p+4, p+2, p+2, p+2, p+1, p
†2p+4, p+2, p+2, p+1, p+1, p+1
*2p+2, p+2, p+2, p+2, p+2, p+1
**2p+1, p+2, p+2, p+2, p+2, p+2.


In this and the remaining sections all the partitions of the degree of F 
which are impossible by Theorem 1 are prefixed by the asterisk *. Likewise 
all partitions which are impossible by Theorem 4 are prefixed by the dagger
† and those which contradict Theorem 5 by the two asterisks **.

In the present case, among the partitions which remain after eliminating 
those impossible by Theorems 1, 4, and 5, the partition 2p+4, p+2, p+2, 
p+1, p+1, p+1 is also impossible, for F cannot have constituents of degrees 
p+2 and p+1 at the same time without containing a negative substitution 
(see §7). In the remaining cases, partitions of the degree of F which are 
impossible for this reason will be prefixed by the double dagger ‡.

Now consider the possible J groups. The 10 partitions of the degree
of F that remain to be considered show that the order of J_1' is even. Then
if d=1, k=12, and 12k'=24, 48, 72, 120, 144, 168, 240, 360, 720, 2520, 5040.
The orders 360, 2520, 5040 are impossible, for there are no groups of orders
30, 210, or 420 on <8 letters. Since there are no partitions which allow
J_1' to be of order 2, 12k'≠24. Neither are there any partitions which allow
J_1' to be of order 4, for the J group of a constituent of degree 2p+4 is octic.
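
The divisions behind this paragraph are easy to check by machine. The following Python sketch assumes nothing beyond the values stated in the text (k=12 and the listed candidates for 12k'); the variable names are illustrative only.

```python
# Candidate orders 12k' listed in the text for the case d = 1, k = 12.
orders = [24, 48, 72, 120, 144, 168, 240, 360, 720, 2520, 5040]
k = 12

# Every listed order must be a multiple of k for k' to be an integer.
assert all(n % k == 0 for n in orders)

# k' is the order of the subgroup that fixes one letter.
k_prime = {n: n // k for n in orders}

for n in orders:
    print(n, "->", k_prime[n])
```

The entries for 360, 2520 and 5040 give k' = 30, 210 and 420, the three orders the text rules out, while 24 and 48 give the orders 2 and 4 discussed in the last two sentences.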

If 12k'=72, the only possible partition is 3p+3, p+2, p+2, p+2, p+2,
for the J group of a constituent of degree 3p+6 is at least of order 18. This
partition brings a substitution of degree and order 3 into J_1'. The group J_1
is imprimitive, for a primitive J_1 would be alternating, but if J_1 is of degree
>q, it is not alternating (III, §20). Then J_1 has systems of imprimitivity
of 3, 4, or 6 letters and its order is 324, 648, or greater.

If 12k'=120, no partition of the degree of F is possible, for the least order
of the J group of a constituent of degree 5p+10 is 50 (IV, §3). If 12k'=144
or 720, the only possible partition is again 3p+3, p+2, p+2, p+2, p+2,
but as has been seen this partition is incompatible with the order of J_1'.
When 12k'=168 or 240, no partition of the degree of F allows J_1' to be of
order 14 or 20.

If d=2, k=6, and 6k'=12, 24, 36, 48, 60, 72, 120, 360, 720. As we have
seen, no partition of the degree of F allows J_1' to be of orders 2, 4, 6, 10, 12,
20, or 60.

Now when d=2, J_1' is invariant in a subgroup of twice its order and
therefore fixes two letters of J_1. Then when d>1, J_1 may be constructed by first
writing down the transitive representation of K on k letters with respect to
its subgroup of order k' and then making it simply isomorphic to itself in d
different sets of letters and in such a way that the subgroup of order d
permutes these d transitive constituents cyclically and is commutative with
each substitution of K. With the above in mind consider the case when J_1'
is of order 8. There are only two possible partitions of the degree of F:
2p+4, 2p+4, p+1, p+1, p+1, and 2p+4, 2p+1, p+2, p+2, p+2. In
the second partition since the constituent of degree 2p+4 gives J_1' a
constituent of degree 4, J_1' must have two constituents of degree 4. Thus J_1' is of
degree 8 and this partition is then incompatible with such a J_1'. The first
partition is also impossible because all the constituents of degree p+1
cannot unite in L if J_1' fixes two letters. Then L has a multiply transitive
constituent of degree p+1 and is impossible by Theorem 1.

If J_1' is of order 120, it has two constituents of degree 5. However no
partition of the degree of F is possible. 

If d=3, then k=4, and J_1' is invariant in a group of three times its order and
therefore it fixes three letters. The only possible partitions of the degree of F,
4p+8, p+1, p+1, p+1, and 2p+4, 2p+4, p+1, p+1, p+1, have a multiply
transitive constituent of degree p+1 in L, for the constituents of degree
p+1 cannot unite in L if J_1' fixes three letters. These partitions are then
impossible by Theorem 1. If d=4, and k=3, the only possible partitions
are the above and again they are impossible. Since no partition of the
degree of F fixes so many as 6 letters, d≠6.

12. Let F be of degree 7p+9. The partitions of the degree of F are the 
following: 

5p+5, 2p+4
4p+8, 3p+1 




4p+3, 
*Sp+6, 
Sp+5, 
4p+8, 
4p+8, 
*4p+4, 
3p +6, 
*3p +6, 
t3p+6, 
tt3p+6, 
tt3p+3, 
3p+1, 
4p+8, 
*4p+4, 
**4543, 
*3p+6, 
t3p+6, 
+6, 
t3p+6, 
13p+3, 
tt3p+3, 
tt3p+3, 
*3p+2, 
3p+1, 
2p+4, 
2p+4, 
*2p+4, 
{2p+4, 
t3p+6, 
+6, 
{3p+3, 
13p+3, 
*3p+2, 
**3 5+ 
2p+4, 
*2p+4, 
2p+4, 
12p+4, 
t2p+4, 
{2p+4, 




pt+i 


3p+6 
p+2, pt2 
2p+i1, 
2p, 
2p+4, pti 
3p+3, 
3p+2, pti 
3p+1, pt2 
2p+2, 2pt+l 
2p+4, 2p+2 
2p+4, 2p+4 
p+i, p 
pt2, pri 
pt2, p+2 
2p+2, pti, p 
2p+1, p+2, p 
2p+i1, pti, pti 
2p, p+2, pti 
2p+4, pt2, p 
2p+4, pti, pti 
2p+2, p+2, pt2 
2p+4, p+2, p+i 
2p+4, p+2, p+2 
2p+4, 2p+1, >» 
2p+4, 2p, p+1 
2p+2, 2p+2, pti 
2p+2, 2p+l, pt2 
p+2, pti, 
p+1, pti, pti, p 
p+2, pt2, pt2, p 
p+2, pt+2, pti, pti 
pt+2, pt+2, pti 
pt2, pt+2, pt2 
2p+4, pti, 
2p+2, pt+2, pti, 
2p+2, pti, pti, pti 
2p+1, pt2, pt+2, p 
2p+1, pt2, pti 
2p, pt2, pti 




*2p+2, 2p+2, pt+2, 
**2p+2, 2p+i, pt+2, p+2, 
12p+4, p+2, p+2, 
12p+4, pt+2, 
2p+4, pti, ptt, pti, 
*2p+2, p+2, pt2, pt+2, 
*82p+2, pt+2, pt+2, pti, pt+i 
**2p+1, p+2, pt+2, pt2, 
**2p+1, pt+2, pt2, p+2, p+i 
**2p,  p+2, p+2, pt2, pt2, 

We now delete all those partitions of the degree of F which contradict
Theorems 1, 4, and 5, and those which cause a constituent group of F to have
a negative substitution. In the partitions preceded by the two daggers ††,
J_1' has a substitution (IV, §3) of degree and order 3. If J_1 contains such a
substitution it is imprimitive, for a primitive J_1 is alternating, and we know
that J_1 is not alternating when its degree >q. The group J_1 has systems of
imprimitivity of 5 letters only and its least possible order is 7200. Since a J_1
of this or greater order cannot be written on 7 or fewer letters, these partitions
are impossible.

Now consider the possible J groups. If d=1, k=10, and 10k'=20, 40,
60, 120, 240, 360, 720, 2520, 5040. Odd values of k' need not be considered,
for the partitions of the degree of F show that the order of J_1' is even.
Since there are no groups of order 252 or 504 on <8 letters, 10k'≠2520
or 5040. The orders 20, 40, 60, and 120 are also impossible, for the
partitions of the degree of F do not allow J_1' to be of order 2, 4, 6, or 12.

If 10k'=240, the group {abcde, ab, fg} may be represented on 10 letters
by means of its symmetric subgroup of degree 4. This representation gives a
J_1' with two constituents of degree 4 in a simple isomorphism. The only
possible partitions of the degree of F are the following: 3p+1, 2p+4,
2p+4; 2p+4, 2p+4, 2p+1, p; 2p+4, 2p+4, 2p, p+1; 2p+4, 2p+4,
p+1, p, p. However these partitions are all impossible, for they all contain a
constituent of degree 2p+4 which demands that an octic group be invariant
in the symmetric group of degree 4.

If 10k'=360, the alternating group of degree 6 with respect to its
subgroup {abc, def, aebd·cf} gives a doubly transitive J_1. Since there are no
partitions which allow J_1 to be doubly transitive, 10k'≠360. If 10k'=720,
the symmetric group of degree 6 with respect to its subgroup {ab, ac, de, df,
ad·be·cf} also gives a doubly transitive J_1.

If d=2, k=5, 5k'=10, 20, 60, 120. We have seen that J_1' cannot be of
order 2, 4, or 12. If 5k'=120, we have a J_1' with two constituents of degree
4. Such a J_1' group has been seen to be impossible. (See this section,
paragraph 4.)

13. Let F be of degree 7p+8. The partitions of the degree of F are the 
following: 
*6p+6, p+2 
*5p+6, 2p+2
5p+4, 2p+4
4p+8, 3p 
4p+2, 3p+6 
*Sp+6, p+2, 
*Sp+6, p+, 
*Sp+5, pt+2, 
**5p+4, p+2, 
4p+8, 2p, 
4p+4, 2p+4, 
*4p+4, 2p+4+2, 
*4p43, 2p+4, 
t4p+2, 2p+4, 
3p+6, 3642, 
*3p+6, 3p+1, 
13p+6, 32, 
3p+3, 3p+3, 
3p+6, 
3p+6, 
3p+3, 
3p+2, 
3p, 
4p+8, 
**45+4, 
*4p+4, 
*4p+3, 
**45+4-2, 
3p+6, 
*3p+6, 
3p+6, 
*3p+3, 
*3p +3, 
3p+3, 
13p+2, 




3p+2, 2p+4, p+1, p+1
**3p+2, 2p+2, p+2, p+2
*3p+1, 2p+4, p+2, p+1
3p, 2p+4, p+2, p+2
2p+4, 2p+4, 
2p+4, 2p+2, 
*2p+4, 2642, 
t2p+4, 2p+2, 
*2p+2, 2642, 
t3p+6, p+2, 
3p+6, 
*3p+3, pt+2, 
13p+3, p+2, 
**3p+2, 
**3p4+2, p+2, 
*3p+1, pt+2, 
**35, p+2, 
2p+4, 2p+4, 
t2p+4, 2p+2, 
2p+4, 2p+2, 
*2p+4, 2p+1, 
2p+4, 2p+1, 
{2p+4, 2p, 
t2p+4, 2p, 
**2p+2, 2p+2, 
*26+2, 2642, 
*2p+2, 2p+1, 
**29+-2, 2p, 
**26+1, 26+41, 
f2p+4, p+2, 
f2p+4, p+2, 
2p+4, 
**2p+2, p+2, 
**2p+2, pt+2, 
**2p+2, pt+2, 
*2p+1, p+2, 
**2p+1, pt+2, 
**20, p+2, 
**20, p+2, 
First strike out all the partitions of the degree of F which are impossible
by Theorem 1. A partition containing a constituent of degree 5p+6 is
included in this category, for such a constituent is doubly transitive (II, p. 147).
Likewise eliminate all those partitions which contradict Theorems 4 and 5
and those which cause a constituent group of F to contain a negative
substitution. Then of the original 75 partitions of the degree of F, only 25
remain to be considered.

Now consider the possible J groups. If d=3, k=3, and J_1' can be of
order 2 only, but none of the partitions that remain allow J_1' to be of order 2.
Then d=1, k=9, and 9k'=18, 36, 72, 144, 360, 720, 2520, 5040. The orders
720, 2520, and 5040 are impossible, for there are no groups of orders 80, 280,
or 560 on <8 letters. Odd values of k' have not been considered, for the
partitions of the degree of F show that J_1' is of even order. Moreover, no
partition allows J_1' to be of order 2 or 4.

If 9k'=72, there is only one group of order 72 on <8 letters which
contains no invariant subgroup in its subgroup of order 8. The group {ab, ac,
de, df, ad·be·cf} with respect to one of its octic subgroups gives the J_1'

1
2437·5698
2734·5896
23·47·59·68
47·58·69
27·34·59
24·37·68
23·56·89.
The possible partitions of the degree of F for such a J_1' are the following:
5p+4, 2p+4; 4p+4, 2p+4, p; 3p, 2p+4, 2p+4; 3p, 2p+4, p+2, p+2;
2p+4, 2p+4, 2p, p; 2p+4, 2p+2, 2p+2, p; 2p+4, 2p+4, p, p, p. The J
groups of the partitions 3p, 2p+4, p+2, p+2, and 2p+4, 2p+2, 2p+2, p,
are in multiple isomorphism while the constituents of J_1' are in simple
isomorphism.

Now consider the partition 5p+4, 2p+4. The subgroup L of H_{p+1} has
the same transitive constituents. We shall now apply Theorem 6 to the
constituent of degree 2p+4. We know (IV, §3) that the subgroup that
fixes one letter of the constituent of degree 2p+4 has a transitive constituent
of degree 2p+2. Thus k_1=1, k_2=2p+2, and r=5p+4. Thus s=(2p+4)/(5p+4)
or (2p+4)(2p+2)/(5p+4). A moment's calculation shows that
these are impossible equations, for s is an integer.
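
The "moment's calculation" can be spot-checked numerically. The sketch below is only an illustration: the helper names and the search bound 5000 are assumptions, and the range starts at 13 because the text works under the hypothesis p>11.

```python
from fractions import Fraction

def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def s_candidates(p):
    """The two values of s named in the text, with r = 5p+4."""
    r = 5 * p + 4
    return [Fraction(2 * p + 4, r), Fraction((2 * p + 4) * (2 * p + 2), r)]

# Hypothetical search bound; the text assumes p > 11.
primes = [p for p in range(13, 5000) if is_prime(p)]

# Primes for which either candidate value of s is an integer.
exceptions = [p for p in primes
              if any(s.denominator == 1 for s in s_candidates(p))]
print(exceptions)
```

The search finds no exceptions, and a short congruence argument explains why: 25(2p+4)(2p+2) leaves remainder 24 on division by 5p+4, and 25 is prime to 5p+4, so 5p+4 would have to divide 24.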

In an imprimitive constituent of degree 2p+4, generated by
substitutions of order p and of degree 2p, the invariant substitution in its J group
fixes the 2p letters of A_1 (IV, §3). Consequently the substitution from the J
group of another constituent, associated with it, cannot be a substitution
from the metacyclic group, for the substitutions from the metacyclic group
have cycles on letters of each cycle of A_1. Now the J group of a constituent
of degree p in F is the metacyclic group or one of its subgroups. Then in the
partition 2p+4, 2p+4, p, p, p, F has a substitution of order 2 and degree 8.
There may be two kinds of substitutions in the J group of a constituent of
degree 2p or 3p in F: substitutions from the metacyclic group which do not
permute cycles of A_1 and substitutions that permute cycles of A_1. The latter
may again be of two kinds: substitutions which are commutative with each
substitution in A_1 and substitutions which are the product of these and
substitutions from the metacyclic group. The latter substitutions have
cycles on letters of each cycle of A_1. Thus in the partition 2p+4, 2p+4, 2p,
p, the invariant substitution of order 2 and degree 8 in J_1' either fixes the 2p
letters of the constituent of degree 2p or is associated with a substitution
of order 2 and degree 2p from it. The latter is a negative substitution, while
the constituent groups of F are positive groups. In the partition 2p+4,
2p+4, 3p, there is likewise a substitution of order 2 and degree 8 or 8+2p,
for the invariant substitution of degree 8 may fix the 3p letters of A_1 in the
constituent of degree 3p, be associated with a substitution of order 2 and
degree 2p, or be associated with a substitution of order 3 and of degree 3p
from it. Thus this partition, 2p+4, 2p+4, 3p, is also impossible.

The systems of imprimitivity of a constituent of degree 2p+4 can be
chosen in only one way and the choice is determined by the transpositions
in its octic J group. Then the non-invariant substitutions of order 2 and of
degree 4 in the octic group permute systems of imprimitivity. Such
substitutions cannot fix the 2p letters of A_1, for then the primitive group according
to which the constituent of degree 2p+4 permutes its systems contains a
transposition and is consequently symmetric, but since the constituent
groups of F are positive groups, the group of the systems is also positive.
Again, since the constituent groups of F are positive groups, a positive
permutation must be associated with the non-invariant substitutions of the
axial subgroup of the J group. The only positive substitutions in the J
group of a constituent of degree 2p+4, on the letters of A_1 only, are
substitutions from the metacyclic group. Then there must be substitutions from the
metacyclic group in the J group of a constituent of degree 2p+4. Now
consider the partition 4p+4, 2p+4, p. The group L has transitive
constituents of the same degrees. In it the constituent of degree p is simply transitive,
for it cannot be doubly transitive by Theorem 1. Consequently it is a
subgroup of the metacyclic group.* Since the metacyclic group has only one
subgroup of order p, the constituent of degree p in F is cyclic. Therefore there
are no substitutions from the metacyclic group in F, for as we have seen, a
permutation from the metacyclic group in J has cycles on letters of each
cycle of A_1.

* W. Burnside, Proceedings of the London Mathematical Society, vol. 33, pp. 162-185

Let 9k’=144. There is only one group of order 144 on <8 letters, the 
group {abc, ad, ef, eg}. However, it contains no subgroup of order 16 which 
does not contain the axial group, an invariant subgroup of the group of 
order 144. 

If 9k’ = 360, there is no representation, for the alternating group of degree 
6 contains no subgroup of order 40. 

14. Let F be of degree 7p+7. The partitions of the degree of F are the
following: 

*6p+6, p+1
*6p+5, 
*5p+6, 2p+1 
5p+5, 2p+2 
+t5p+3, 2p+4 
4p+4, 3p+3
tt4p+1, 3p+6 
*Sp+6, p+, 
TSp+5, p+2, 
Sp+5, 
*Spt+4, p+2, 
**5p+3, pt+2, 
*Ap+4, 2p+2, 
*4p+4, 2p+1, 
tt4p+3, 2p+4, 
*4p+3, 2p+2, 
*4p+2, 2p+4, 
T4p+1, 2p+4, 
it3p+6, 3p+1, 
*3p+3, 3p+3, 
tt3p+3, 3p+2, 
TT3p+6, 2p+1, 
TT3p+3, 2p+4, 
+13p+3, 2p+2, 
3p+2, 2p+4, 
3p+1, 2p+4, 
*4p+4, 
4p+4, p+, 


ts, 
4p+1, 
+6, 
tt3p +6, 
t13p+3, 
tt3p+3, 
3p +3, 
113p +3, 
3p +2, 
t3p+1, 
3p+1, 
2p+4, 
2p+4, 
12p-+4, 
"2p-+2, 
2p+2, 
tt3p+3, 
tt3p+3, 
an 
3p+2 
**3 0+ 1 
*30, 
2p+4, 
t2p+4, 
2p+4, 
t2p+4, 
2p+2, 
2p+2, 


p+2, 
p+2, 
p+2, 
2p-+1, 
2p, 
2p-+4, 
2p+2, 
2p+2, 
2p+1, 
2p, 
2p+4, 
2p+2, 
2p-+1, 
2p-+4, 
2p-+4, 
2p+2, 
2p+4, 
2p+2, 
2p+2, 
2p+1, 
2p-+1, 
2p+2, 
2p-+2, 
p+2, 
p+2, 
+2, 
p+2, 
p+2, 
p+2, 
p+2, 
2p+2, 
2p-+1, 
2p+1, 
2p, 
2p, 
2p+2, 
2p+2, 
2p+1, 




p+i 


‘pt2 


p+i1, 
p+2, 
pti, 
p 
pt+2, p 
pt+2, pti 
pt+2, pt+2 
p+i, p 
p+2, 
p+2, p+2 
pt+2, p 
p+2, p+2 
p+2, pti 
2p+1, p 
2p+1, pt+1 

2p, p+2 

2p+2, 
2p+1, p+2 
p, p 
p+2, 
P+, p+i, p 
pt+i1, 
p+2, pti, p 
p+2, p+2, p 
p+2, pt+i1, pti 
p+2, p+2, pti 
p+i, p 
pt+2, p, 
p+1, pt+i, p 
p+2, p 
p+i, pt+i, 
p+2, pti, p 
pti, pt+i, 
p+2, p+2, p 




*2p+2, 2p+1, 
**2p+2, 2p, p+2, 
*“2p+1, p+2, 
**2p+1, 29, p+2, 
t2p+4, pt+2, pt, 
2p+4, pti, pti, 
**2p+2, pt+2, pt+2, 
**2p+2, p+2, 
2p+2, 
**2p+1, p+2, p42, 
**2p+1, p+2, pt+2, 
**2p+1, p+2, 
**2p, p+2, pt2, 
**2p, p+2, p+2, pti, 

In addition to striking out all those partitions of the degree of F which
are impossible by Theorems 1, 4, and 5, we shall also exclude all those which
bring a circular substitution of degree 3 into J_1'. For, if J_1 contains such a
substitution, it is imprimitive of order 288, 576, or 1152. All groups of these
orders occur for the first time on 8 letters. These partitions will be preceded
by the two daggers ††. The partitions 5p+5, 2p+2, and 5p+5, p+1, p+1
are also impossible, for if J_1 contains a substitution of degree and order 5 it
is alternating, but J_1 is not alternating when its degree exceeds q.

The 13 partitions of the degree of F that need still to be considered show
that the order of J_1' is even. Then if d=1, k=8, and 8k'=16, 48, 144, 240,
720, 5040. The only possible orders are 16, 48, and 144, for there are no
groups of orders 30, 90, and 630 on <8 letters. Moreover, the order 16
is also impossible, for all of the partitions which allow J_1' to be of order 2
have a multiply transitive constituent in L.

There are three groups of order 48 on <8 letters, namely, {ad, ab·de,
ac·df}, {abc, ad, ef}, and {ab, ac·bd, ef, eg}. The last group contains no
non-invariant subgroup of order 6, while the first two give a J_1' with two
transitive constituents of degree 3 each. Such constituents are incompatible
with any of the partitions that remain to be considered.

The only group of order 144 on < 8 letters, {abc, ad, ef, eg}, contains no 
subgroup of order 18 which includes no non-invariant subgroup. 

If d=2, k=4, and J_1' can be of order 2 only. If d=4, k=2, and k'=1,
but the partitions of the degree of F show that k'≠1.

15. If H_{p+1} is primitive of degree 7p+7, it may lead to a doubly transitive
group of degree 7p+8. Then J_1 is transitive of degree 7 and J is multiply
transitive of degree 8. As we have seen (§3), J_1 must be the simple group of
order 168. Then if J_1 exists, J_1' is intransitive with two cyclic constituents
of degree 3 each. Now F is of degree 7p+6. The only partitions of its degree
compatible with J_1' are the following: 4p+3, 3p+3; 3p+3, 3p+3, p; p+1,
p+1, p+1, p+1, p+1, p+1, p. Any partition containing a constituent of
degree 2p+3 or p+3 is impossible, for the J groups of such constituents are
of order 6.

Now consider the partition 4p+3, 3p+3. The constituent of degree
4p+3 is a simply transitive primitive group (Theorem 1). Its subgroup that
fixes one letter has the following partitions of its degree: 3p+2, p; 3p+1,
p+1; 3p, p+2; 2p+2, 2p; 2p+1, 2p+1; 2p+2, p, p; 2p+1, p+1, p; 2p,
p+2, p; 2p, p+1, p+1; p+2, p, p, p; p+1, p+1, p, p. Since the J group of
the constituent of degree 4p+3 must be cyclic, all the partitions which
bring a transposition into it are impossible. Theorem 1 excludes the following
partitions: 3p+1, p+1; 2p+1, p+1, p; p+1, p+1, p, p. This leaves the
two partitions 2p+1, 2p+1, and 2p, p+1, p+1. Now apply Theorem 6 to
the partition 4p+3, 3p+3. Let the constituent of degree 4p+3 be the
constituent of degree m. Then k_1=2p+1, 2p, or p+1, and r=3p+3. Thus
s=(4p+3)(2p+1)/(3p+3), (4p+3)(2p)/(3p+3), or (4p+3)(p+1)/(3p+3).
Since s is an integer all of these equations are impossible.
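
The three quotients can be tested directly. This is a minimal sketch under stated assumptions: the function names and the bound 5000 are illustrative, and p is taken over primes greater than 11 as elsewhere in the text.

```python
from fractions import Fraction

def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def s_values(p):
    """s = (4p+3) * k_1 / (3p+3) for each k_1 named in the text."""
    m, r = 4 * p + 3, 3 * p + 3
    return {k1: Fraction(m * k1, r) for k1 in (2 * p + 1, 2 * p, p + 1)}

# Hypothetical search bound; the text assumes p > 11.
primes = [p for p in range(13, 5000) if is_prime(p)]

# Pairs (p, k_1) for which s would be an integer.
integral = [(p, k1) for p in primes
            for k1, s in s_values(p).items() if s.denominator == 1]
print(integral)
```

No pair survives. The reason is visible by hand as well: p+1 is prime to each of p, 2p+1 and 4p+3, so the first two quotients cannot be integers, and the third reduces to (4p+3)/3, which is an integer only when 3 divides p.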

Theorem 6 will also be used to eliminate the partitions 3p+3, 3p+3, p,
and p+1, p+1, p+1, p+1, p+1, p+1, p. The group J_1' demands that for
these partitions L have constituents of degrees 3p+3, 3p+3, p. Now we
know that a single constituent of degree p in L is a subgroup of the
metacyclic group (§13, paragraph 7). Let the constituent of degree p be the
constituent of degree m of Theorem 6. Then k_1=b, say, is a divisor of p-1,
and r=3p+3. Thus s=pb/(3p+3). Again since s is an integer this is an
impossible equation.
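
A brute-force check of this last equation is sketched below. It assumes only what the text states (b divides p-1 and r=3p+3); the helper names and the bound 2000 are illustrative.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def divisors(n):
    """All positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]

def integral_b(p):
    """Divisors b of p-1 for which p*b/(3p+3) is an integer."""
    r = 3 * p + 3
    return [b for b in divisors(p - 1) if (p * b) % r == 0]

# Hypothetical search bound; the text assumes p > 11.
primes = [p for p in range(13, 2000) if is_prime(p)]
counterexamples = {p: integral_b(p) for p in primes if integral_b(p)}
print(counterexamples)
```

The dictionary comes out empty, in agreement with the text: since p and p+1 are coprime, p+1 would have to divide b, yet b is at most p-1.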

Thus a primitive H_{p+1} of degree 7p+7 cannot lead to a doubly transitive
group of degree 7p+8.

16. We shall now take up the case p=11. Suppose that F has an
alternating constituent. Then the subgroup E_1 (III, §30) exists. Now H_{p+1} is of
degree not greater than 7p+7 except when E_1 has a transitive constituent
simply isomorphic to its alternating constituent of degree p (compare
footnote of §4). In this case E_1 has exactly three constituents of degrees p, p,
p(p-1)/2. Then H_{p+1} is of degree not greater than 7p+14, and F has two
transitive constituents. An alternating constituent of F cannot involve
letters of more than one cycle of A_1 (III, §28). Then F has a transitive
constituent of degree 6p+k, k=12 or <6, and one of degree p, p+1, or p+2.
This consideration eliminates F of degree 7p+13, 7p+12, 7p+11, 7p+10,
and 7p+9. Then if F is of degree 7p+8, the only possible partition of its
degree is 6p+6, p+2. If F is of degree 7p+7, the only possible partitions of
its degree are 6p+6, p+1, and 6p+5, p+2. These three partitions are
immediately impossible by Theorem 1. Thus if F has an alternating
constituent, the degree of H_{p+1} does not exceed 7p+7. Moreover, H_{p+1} of degree
7p+7 cannot lead to a doubly transitive group of degree 7p+8, for the
reasoning used in §15 depended in no way upon the value of p.

The partitions of the degree of F are then the same as when p was assumed
>11. Now Theorem 5 was the only theorem used in eliminating partitions
which depended upon the value of p. However, all the partitions eliminated
by this theorem contain a constituent of degree p+2 and a constituent of
degree mp+n (2<m<5, 0<n<13). Since p+2 is the prime number 13,
the constituents of degree mp+n contain a substitution of order 13 and of
degree 13m-13 at most. Such a substitution does not respect systems of
imprimitivity if the constituent is imprimitive. Then the constituents of
degree mp+n are primitive. Moreover, they are alternating (see theorems
quoted in §1). However, if F has an alternating constituent which involves
more than one cycle of A_1, G contains a substitution of order 11 and of degree
<77 (III, §28). Hence these partitions are also impossible when p=11.

Now H_{p+1} of degree 7p+11 was shown to be impossible (§10) except
when p=11. We shall now consider this case. The partitions of the degree of
F are the following:

5p+10, 2p 

Sp+6, 

Ap+8, 

4p+4, 

5p+10, 

Sp+6, 

Ap+8, 

4p+8, 

4p+8, 

4p+4, 

3p+6, 

3p+6, 

3p+6, 

3p+6, 

3p+2, 

4p+8, 

4p+8, 

4p+4, p+2, pt+2 
3p+6, 




2p+2, pt2, p 

2p+2, pti, pti 

2p+1, pt2, pti 

2p, p+2, 

2p +4, p+2, 

2p+4, 2642, 

2p+4, 2p+1, 

2p+4, 2p, 

2p+2, 2p+2, 
p+2, pt2, 
p+2, pti, 
p+1, 
p+2, pt2, 
p+2, pt+2, 

2p+4, pt2, 

2p+4, pt, 

2p+2, 

2p+2, p+2, p+i 

2p+1, pt+2, pt+i 

2p, pt+2, p+2 

2p+2, p+2, p+2 
p+2, pt2, p; p 
p+2, p+2, 
p+2, pti, p+i, pti 
pt+2, p+2, p 
p+2, p+2, p+i, pti 
p+2, p+2, 
p+2, pt2, p+2, p+2. 

Since p^2 does not divide the order of F, J_1 is transitive of degree 11. Now
the largest group of order p*(p—1)(q!) on the same letters in which {A;} is 
invariant has just one subgroup of order 2 (III, §22). Then J_1 has an invariant
subgroup of degree and order p and consequently is of class p-1=10.
Consider the partitions of the degree of F. The J group of a constituent of
degree 5p+10 is of class 5 at most (IV, §3). Thus the only possible partitions
are the following: 3p+2, p+2, p+2, p+2, p+2; 2p+2, 2p+2, p+2, p+2,
P+2; P+2, P+2, P+2, P+2, 2p, P+2, P+2, P+2, pt2. 
However, since p+2 is the prime number 13, all of these partitions bring a
substitution of order 11 and of degree <77 into G.

This completes the proof of the case q=7. It has been shown that H_{p+1}
of degree >7p+7 (p>7), does not exist. Moreover, H_{p+1} of degree 7p+7
can lead to a doubly transitive group of degree 7p+8 only if it is imprimitive.
Thus the degree of G cannot exceed 7p+8.

17. It will now be shown that the degree of a primitive group of class >3,
which contains a substitution of prime order p (p>7) and of degree 6p
cannot exceed 6p+6. The present limit of 6p+10 given by Manning depends
upon the possibility of the existence of a primitive H_{p+1} of degree 6p+9
(IV, p. 73). Furthermore, H_{p+1} of this degree can exist only if the partitions
of the degree of F are 4p+4, 2p+4 or 2p+2, 2p+2, p+2, p+2. We find that
the real difficulty lies in trying to eliminate the former partition.

Before these partitions are discussed, a correction in the list of the
partitions of the degree of F should be made. The omitted partitions are
the following:

F of degree 6p+11: 

2p+4, pt+2, pt2, 
F of degree 6p+9:

2p+4, pt+2, pt+2, 

2p+4, pt+2, pti, 

2p+2, p+2, p+2, 

2p+1, p+2, p+2, 

F of degree 6p+8: 

2p+4, pt2, pt2, 


2p+4, p+2, 

2p+4, pt+i1, pti, 

2p+2, pt+2, pt+2, 

2p+2, p+2, pt2, 

22, P+2, pt+2, 
F of degree 6p+7: 


p+2, p+, 

p+1, p+, 

pt+2, pt+2, 

pt+2, pt+i 

pt+2, p+2, pt+i. 
However, all of these partitions except the two following: 2p+4, p+1, p+1,
p+1, p, and 2p+4, p+1, p+1, p+1, p+1, are immediately impossible,
because either F contains a negative substitution (see §7) or Theorem 5 is
contradicted. The two partitions which remain are incompatible with any
of the J groups for H_{p+1} of these degrees.

We then turn to the consideration of the partitions of the degree of F
that cause difficulty when H_{p+1} is of degree 6p+9. There is an incorrect


p+1, pti 
p+2, 
pti 
p+2, p+2; 




statement (IV, p. 72) regarding the partition 3p+2, p+2, p+2, p+2.
However, it may immediately be dismissed from the discussion for it
contradicts Theorem 5. Similarly, the partition 2p+2, 2p+2, p+2, p+2 is
incompatible with Theorem 5. Then we need to consider only the partition
4p+4, 2p+4.

We recall that the only J_1' compatible with this partition is the octic
group written out in §13 of this paper. Now apply Theorem 6 to this
partition. Let L(x) be the subgroup that fixes the letter x of H_{p+1}. Choose the
constituent of degree 2p+4 as the constituent of degree m on the letters
a_1, a_2, ..., a_{2p+4}. We know (IV, §3) that k_1=1, and k_2=2p+2. Now
r=4p+4. Then s=(2p+4)/(4p+4) or (2p+4)(2p+2)/(4p+4). Since s
is an integer, the former of these two equations is impossible, and from the
latter s=p+2. Thus x belongs to a transitive constituent of degree p+2 in
L(a_1)(a_2). Now L(a_1)(a_2) contains substitutions of order p, for it is the
subgroup that fixes one letter of the constituent of degree 4p+4. Therefore the
order of the constituent of degree p+2 is divisible by p, for if it were not,
L(a_1)(a_2) would contain a substitution of order p and of degree <6p. Thus
the constituent of degree p+2 contributes a transposition to J_1'.
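
The two values of s in this application of Theorem 6 can be evaluated exactly. The sketch below is illustrative only: the function name is an assumption, and the sample primes are chosen to satisfy the hypothesis p>7 of this section.

```python
from fractions import Fraction

def s_pair(p):
    """The two candidate values of s, with m = 2p+4 and r = 4p+4."""
    m, r = 2 * p + 4, 4 * p + 4
    return Fraction(m, r), Fraction(m * (2 * p + 2), r)

for p in (11, 13, 17, 19, 23):
    first, second = s_pair(p)
    # The first quotient lies strictly between 0 and 1, so it is never an integer.
    assert 0 < first < 1
    # The second quotient simplifies to p+2, as stated in the text.
    assert second == p + 2
    print(p, first, second)
```

The second identity holds for every p, since (2p+4)(2p+2) = (4p+4)(p+2).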

Let us see what J_1' demands of the subgroup L(a_1)(a_2). First note that
in this subgroup, the constituent of degree 4p+4 in L(a_1) can contribute at
most a transposition to J_1'. J_1' then demands that the constituent of degree
2p+4 contribute a substitution of order 2 and of degree 4 to it from this
subgroup. Since the order of the constituent of degree p+2 in L(a_1)(a_2)
is divisible by p, the order of every transitive constituent of L(a_1)(a_2) is
divisible by p, for if it were not, the invariant subgroup generated by all
the substitutions of order p in L(a_1)(a_2) would fix the letters of the
constituents whose order is not divisible by p, and this subgroup would bring a
substitution of order 2 and of degree <6 into J_1'. Then the possible partitions
of the degree of L(a_1)(a_2) are the following: p+2, p+2, 3p+2, p+1; p+2,
p+2, 3p+2, p; p+2, p+2, 3p+1, p+2; p+2, p+2, 3p, p+2; p+2, p+2,
2p+2, 2p+1; p+2, p+2, 2p+2, 2p; p+2, p+2, 2p+2, p; p+2,
p+2, 2p+2, p, p; p+2, p+2, 2p+1, p; p+2, p+2, 2p, p+2, p+1;
p+2, p+2, 2p, p+2, p; p+2, p+2, p, p; p+2, p+2, p+2, p, p,
p. Now L(a_1)(a_2) has an invariant subgroup of the same degree generated by
all of its substitutions of order p. The transitive constituents of this invariant
subgroup are positive groups. Consequently, the partitions which contain
constituents of degree p+2 and p (or p+1) at the same time are impossible
(see §7). Thus there are only the following partitions: p+2, p+2, 3p+1,
p+2; p+2, p+2, 3p, p+2; p+2, p+2, 2p+2, 2p+1; p+2, p+2, 2p+2,
2p, to be considered.




Consider the last two partitions first. Now apply Theorem 6 to the group
L(a_1), and let the constituent of degree 4p+4 be the constituent of degree m.
Then k_1=1, 2p+2, 2p+1, or 2p, and r=2p+4. Consequently s=(4p+4)/(2p+4),
(4p+4)(2p+2)/(2p+4), (4p+4)(2p+1)/(2p+4), or
(4p+4)(2p)/(2p+4). Since s is an integer all of these equations are impossible.
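
The claim can be verified for each value of k_1 named in the text. The following is a sketch, not part of the original argument: the helper names and the bound 5000 are assumptions, and p ranges over primes greater than 7 as this section requires.

```python
from fractions import Fraction

def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def s_values(p):
    """s = (4p+4) * k_1 / (2p+4) for k_1 = 1, 2p+2, 2p+1, 2p."""
    m, r = 4 * p + 4, 2 * p + 4
    return {k1: Fraction(m * k1, r) for k1 in (1, 2 * p + 2, 2 * p + 1, 2 * p)}

# Hypothetical search bound; this section assumes p > 7.
primes = [p for p in range(11, 5000) if is_prime(p)]

# Pairs (p, k_1) for which s would be an integer.
integral = [(p, k1) for p in primes
            for k1, s in s_values(p).items() if s.denominator == 1]
print(integral)
```

The list is empty. Reducing modulo p+2 shows why: the four numerators, after cancelling the common factor 2, leave remainders 2, 4, 6 and 8, so p+2 would have to divide one of these numbers.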

We also find the first two partitions to be impossible, for the subgroup
that fixes one letter of the constituent of degree $4p+4$ cannot have constituents
of the degrees given. We shall consider, then, a group of degree $4p+4$
whose subgroup that fixes one letter has constituents of degrees $3p+1$ and
$p+2$ or of degrees $3p$ and $p+2$. Let $L(y)$ be the subgroup that fixes the
letter $y$ of the group of degree $4p+4$. Let $c_1, c_2, \dots, c_{p+2}$ be the letters of
the constituent of degree $p+2$ in $L(y)$. Then $L(y)(c_1)$ has a transitive
constituent of degree $p+1$ on the letters $c_2, c_3, \dots, c_{p+2}$. If the order of
$L(y)$ is $t$, the order of $L(y)(c_1)$ is $t/(p+2)$ and the order of $L(y)(c_1)(c_2)$ is
$t/[(p+2)(p+1)]$. In $L(c_1)$, $y$ belongs to a transitive constituent of degree
$p+2$. The $p+1$ letters $c_2, c_3, \dots, c_{p+2}$ cannot form with $y$ a transitive
constituent of degree $p+2$ in $L(c_1)$, for, then, the group $\{L(y), L(c_1)\}$ has a
transitive constituent of degree $p+3$, which brings a substitution of degree
and order 3 into $J_1'$. Thus since the $p+1$ letters $c_2, c_3, \dots, c_{p+2}$ cannot
belong to the transitive constituent of degree $p+2$ in $L(c_1)$, they must belong
to the transitive constituent of degree $3p+1$ or $3p$. Then the order of
$L(c_1)(c_2)$ is $t/(3p+1)$ or $t/(3p)$ according as $L(c_1)$ has a constituent of degree
$3p+1$ or $3p$. If $y$ belongs to a transitive constituent of degree $s$ in $L(c_1)(c_2)$,
the order of $L(c_1)(c_2)(y)$ is $t/[(3p+1)s]$ or $t/(3ps)$. Then $s = (p+2)(p+1)/(3p+1)$
or $(p+2)(p+1)/(3p)$. However, $s$ is an integer.
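
The contradiction with the integrality of $s$ can be made explicit. The computation below is an illustrative sketch, not taken from the paper; it uses only elementary congruences and the hypothesis that $p$ is a prime greater than 7.

```latex
% Case 3p: p would have to divide (p+2)(p+1).
\[
(p+2)(p+1) \equiv 2 \cdot 1 = 2 \pmod{p},
\]
% so p | (p+2)(p+1) forces p | 2, impossible for p > 7.

% Case 3p+1: multiply by 9, which is prime to 3p+1 since 3p+1 = 1 (mod 3).
\[
9(p+2)(p+1) = (3p+6)(3p+3) \equiv 5 \cdot 2 = 10 \pmod{3p+1},
\]
% so (3p+1) | (p+2)(p+1) forces (3p+1) | 10, impossible since 3p+1 > 10 for p > 7.
```

Neither quotient can therefore be an integer, which is the contradiction the text invokes.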

Thus it has been shown that a primitive group of class $>3$ which contains
a substitution of prime order $p$ ($p>7$) and of degree $6p$ does not exist. The
case $p=7$ will now be considered. Theorem 5 was the only theorem used in
eliminating partitions which depended upon the value of $p$. The partitions
thrown out by means of this theorem were $2p+1, p+2, p+2,
3p+2, p+2, p+2, p+2$; $2p+2, 2p+2, p+2, p+2$; $2p, p+2, p+2, p+2,
p+2$. Manning has already shown that the third partition is impossible when
$p=7$ (IV, p. 78). In the first partition the constituent of degree $2p+1$
$(=15)$ is primitive. Moreover it is doubly transitive, for a simply transitive
primitive group of degree 15 whose subgroup that fixes one letter has two
transitive constituents of degree 7 does not exist.*

This partition is then impossible by Theorem 1. In the second partition
the constituent is of degree $3p+2 = 23$, a prime number. Then $G$ has a substitution
of degree and order 23 and consequently its degree does not exceed 25.
In the last partition, the constituent of degree $2p$ $(=14)$ is imprimitive, for a
simply transitive group of degree 14 does not exist,† and a multiply transitive
group of degree 14 is impossible by Theorem 1. Now the constituent of
degree 9 is at least triply transitive and consequently is either the Mathieu
group of order 504 or the group of order 1512 and of class 6. The constituent
of degree 14 has systems of imprimitivity of two letters only. Its group in the
systems is a primitive group of degree 7. Then the group of degree 9 cannot
be simply isomorphic to this group in the systems, for these groups of degree
9 occur for the first time on 9 letters. The Mathieu group is then impossible,
for it is a simple group. The only invariant subgroup of the group of order
1512 is the Mathieu group of order 504. Then the group in the systems of the
constituent of degree 14 must have a quotient group of order 3. However,
since the constituent of degree 14 contains more than one subgroup of order
$p$, its group in the systems cannot have such a quotient group.

Thus the theorem for the case $q=6$ now reads


The degree of a primitive group of class $>3$ which contains a substitution of
prime order $p$ ($p>7$) and of degree $6p$ cannot exceed $6p+6$. If $p=7$, the true
limit of the degree of $G$ is $6p+7$.


* G. A. Miller, Proceedings of the London Mathematical Society, vol. 28 (1897), p. 540.

† G. A. Miller, Quarterly Journal of Mathematics, vol. 29 (1897), p. 242.


STANFORD UNIVERSITY, 
STANFORD UNIVERSITY, CALIF.




GENERALIZED LAGRANGE PROBLEMS IN THE 
CALCULUS OF VARIATIONS* 


BY 
C. F. ROOS†


I. INTRODUCTION 


In the new dynamical theory of economics there arises a very general 
problem which can be said to be a generalization of the Lagrange problem in 
the calculus of variations.‡ It will not be necessary to consider the formulation
of the corresponding economic theory here since I have already done
this in another paper.§ It would hardly be fair, however, to introduce the 
reader to a rather unusual mathematical situation without giving some hint 
as to its origin. It seems desirable, therefore, to give first a brief economic 
formulation of the problem whose mathematical aspects will be discussed 
in this paper. 

If there are two producers of an identical commodity $C$, manufacturing,
respectively, amounts $u_1(x)$ and $u_2(x)$ of $C$ per unit time, subject to the
respective cost functions $\phi_1(u_1, u_1', u_2, u_2', u_3, u_3', x)$ and
$\phi_2(u_1, u_1', u_2, u_2', u_3, u_3', x)$, where $u_3(x)$ is the selling price of $C$
at a time $x$, then the respective profits obtained during an interval of time
$x_0 \le x \le x_1$ are

$$I_1 = \int_{x_0}^{x_1} [u_1 u_3 - \phi_1(u_1, u_1', \dots, u_3, u_3', x)]\,dx,$$

$$I_2 = \int_{x_0}^{x_1} [u_2 u_3 - \phi_2(u_1, u_1', \dots, u_3, u_3', x)]\,dx,$$

where $\phi_1$ and $\phi_2$ are assumed to be continuous with their first and second
derivatives with respect to all their arguments, and primes denote derivatives
with respect to time $x$.

The rates of production $u_1(x)$ and $u_2(x)$ and the price $u_3(x)$ will satisfy an
equation of demand which in the general case will be of the form


* Presented to the Society, December 31, 1926; received by the editors December 2, 1926. 

† National Research Fellow in Mathematics.

‡ For a special example of this problem, see C. F. Roos, A mathematical theory of competition,
American Journal of Mathematics, vol. 47 (1925), pp. 163-175. See also G. C. Evans, The dynamics
of monopoly, American Mathematical Monthly, vol. 31 (1924).

§ C. F. Roos, A dynamical theory of economics, Journal of Political Economy, vol. 35 (1927). 
See also Roos, Dynamical economics, Proceedings of the National Academy of Sciences, vol. 13 (1927). 




$$G(u_1, u_1', u_2, u_2', u_3, u_3', x) = \int_{x_0}^{x} P(u_1, u_1', \dots, u_3, u_3', x, s)\,ds \tag{1}$$


where $G$ and $P$ have continuity properties similar to those of $\phi_1$ and $\phi_2$.*
Each manufacturer will consider his rate of production to be influenced by
the rate of production of his competitor only through the equation of demand,
and will desire to determine his own rate of production in such a way that
he obtains a maximum profit over some interval of time, say $x_0 \le x \le x_1$.

The problem of competition for this state of affairs will then be the
problem of determining a curve $\Gamma$ in the space $(u_1, u_2, u_3, x)$, satisfying a
functional equation (1), such that an integral $I_1$, taken along $\Gamma$ from $x_0$ to
$x_1$, is a maximum when $u_2$ is momentarily held fixed, and such that a second
integral $I_2$, also taken along $\Gamma$ from $x_0$ to $x_1$, is a maximum when $u_1$ is
momentarily held fixed. In the usual case the initial time $x_0$ and the corresponding
initial values of the $u_i$, $i=1, 2, 3$, are fixed. The end time $x_1$ and the
corresponding end values of the $u_i$ may be regarded as fixed or not, depending
upon the nature of the problem under consideration. Both cases will be
considered at some length in the following paragraphs.

For the particular case $P=0$ the equation of demand becomes simply a
first-order differential equation. For this case the problem of competition
can be solved by the methods employed in the classical Lagrange problem
in the calculus of variations.† In order to obtain a solution in the classical
way we need, however, two sets of Lagrange multipliers, and this makes
the problem quite difficult. In the following pages I shall give an analysis
for the case in which the rates of production and price are related by a
differential equation of demand $G(u_1, u_1', u_2, u_2', u_3, u_3', x) = 0$ without using
multipliers, and shall obtain necessary and sufficient conditions. These
conditions, although functional in character, seem simpler than the corresponding
conditions which would be obtained by the classical analysis.

In discussing the Lagrange problem for several differential equations
$G_k(u_1, u_1', \dots, u_n, u_n', x) = 0$, $k = 1, \dots, m < n$, I introduce the theory
of Volterra integral equations into my analysis to replace the classical
theory by means of multipliers. This use of the theory of integral equations
enables me to obtain a method for solving the more general problem for which
$P_k(u_1, u_1', \dots, u_n, u_n', x, s) \not\equiv 0$. So far as I know this use of integral
equations is entirely new. As a result the following exposition, although lengthy,
does not represent a complete treatment of the subject.


* Roos, Dynamical economics, loc. cit.

† J. Hadamard, Leçons sur le Calcul des Variations, pp. 217 et seq. See also G. A. Bliss,
The Problem of Lagrange in the Calculus of Variations, lectures given at the University of Chicago,
summer quarter 1925, mimeographed by O. E. Brown, Northwestern University, Evanston, Illinois.


II. FIXED END POINTS. EULERIAN EQUATIONS IN FUNCTIONAL FORM 


1. Geometrical interpretation of the problem. In order to make our
analysis easier to follow let us first examine the problem for which both
end points are fixed, and for which $P(u_1, u_1', \dots, x, s) \equiv 0$, from a
geometrical view point. In the hyperspace $(u_1, u_2, u_3, x)$ let $u_2 = u_2(x)$ be any
function, continuous with its first derivative, and substitute this value of $u_2$
in the integrand $F_1(u_1, u_1', \dots, u_3', x)$ of an integral $I_1$, corresponding to the
$I_1$ of the introduction, and in the differential equation $G=0$. The function
$F_1$ becomes a function $F_1(u_1, u_1', u_2(x), u_2'(x), u_3, u_3', x)$, and $G=0$ becomes
a differential equation $G(u_1, u_1', u_2(x), u_2'(x), u_3, u_3', x) = 0$. The problem
of finding $u_1 = y_1(x)$ which maximizes $I_1$ is thus reduced to the problem of
finding a function $y_1(x)$ which maximizes

$$\int_{x_0}^{x_1} F_1(u_1, u_1', u_2(x), u_2'(x), u_3, u_3', x)\,dx,$$

and satisfies $G=0$ and given end conditions whatever they may be.

Again, if $u_1(x) = y_1(x)$ be substituted in the integrand $F_2$ and in $G=0$,
these become, respectively, $F_2(y_1(x), y_1'(x), u_2, u_2', u_3, u_3', x)$ and
$G(y_1(x), y_1'(x), u_2, u_2', u_3, u_3', x) = 0$. Choosing the function $u_2(x) = y_2(x)$
so that it satisfies $G=0$ and maximizes

$$\int_{x_0}^{x_1} F_2(y_1(x), y_1'(x), u_2, u_2', u_3, u_3', x)\,dx$$

completes the solution of the problem, for $u_1$ and $u_3$ have already been
determined in terms of $u_2(x)$. It is important to note that we have assumed
the existence of a solution without showing that one actually exists. Conditions
for the existence of a solution will be discussed in Part IV of this paper.

2. Admissible arcs and variations. An arc $u_i = u_i(x)$, $i=1, 2, 3$, which
is continuous on the interval $x_0 \le x \le x_1$, and is such that the interval can
be divided into a finite number of subintervals on each of which the functions
$u_i(x)$ have continuous derivatives up to and including those of the second
order will be called an admissible arc. This definition will permit a maximizing
arc to have a finite number of corners. All of the elements of an admissible
arc shall be required to lie in a simply connected region of a hyperspace
$(u_1, u_2, u_3, x)$, and to satisfy the differential equation
$G(u_1, u_1', u_2, u_2', u_3, u_3', x) = 0$, and, furthermore, to satisfy certain end
conditions.* In the following paragraphs all admissible arcs will be regarded
as fixed at a fixed $x_0$, i.e.

$$u_1(x_0) = u_{10}, \quad u_2(x_0) = u_{20}, \quad u_3(x_0) = u_{30}, \tag{2}$$

and either variable or fixed at $x_1$ depending upon the particular problem
under consideration. The behavior of the arcs at $x_1$ will be pointed out as
the work progresses.

If a two-parameter family of admissible arcs $u_i = u_i(x, a, b)$ containing a
particular admissible arc $\Gamma$ for the parametric values $a=b=0$ be given,
we shall call the functions

$$\xi_1(x) = \partial u_1(x,0,0)/\partial a, \qquad \xi_2(x) = \partial u_2(x,0,0)/\partial b$$

partial variations of the family along $\Gamma$. Ordinarily we would require a
three-parameter family to cover the space $(u_1, u_2, u_3, x)$ completely, but the
differential equation $G=0$ and the initial condition $u_3(x_0) = u_{30}$ remove one
degree of freedom.

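3. The Eulerian equations in functional form. Let us write

$$u_\alpha = y_\alpha + \psi_\alpha(x,a,b) \quad (\alpha = 1,2), \qquad u_3 = y_3 + \theta(x,a,b), \tag{3}$$

where the $\psi_\alpha$ are functions, continuous in $x$, $a$ and $b$, possessing continuous
derivatives of the first order with respect to $x$, $a$ and $b$ and vanishing when
$a$ and $b$ vanish. The functions $y_\alpha$ and $y_3$ are the functions $u_i(x)$, $i=1, 2, 3$,
defining the maximizing curve $\Gamma$ which we suppose for the present to exist
a priori.

In our analysis we shall have to require that the functions $F_1$, $F_2$ and $G$
possess continuous derivatives of the second order with respect to each of
the arguments $u_i$, $u_i'$, $x$, $i=1, 2, 3$, and, furthermore, that $\partial G/\partial u_3' \neq 0$
in the interval $x_0 \le x \le x_1$. Under these hypotheses the function $\theta$ is
determined by $G=0$ and the first two equations of (3), except for an arbitrary
constant, as a continuous function of $x$, $a$, $b$ with continuous derivatives
of the first order.

The derivatives $\partial\theta/\partial a$ and $\partial\theta/\partial b$ satisfy the equations of partial variations

$$(\partial G/\partial u_1)\,\partial\psi_1/\partial a + (\partial G/\partial u_1')\,\partial\psi_1'/\partial a + (\partial G/\partial u_3)\,\partial\theta/\partial a + (\partial G/\partial u_3')\,\partial\theta'/\partial a = 0,$$

$$(\partial G/\partial u_2)\,\partial\psi_2/\partial b + (\partial G/\partial u_2')\,\partial\psi_2'/\partial b + (\partial G/\partial u_3)\,\partial\theta/\partial b + (\partial G/\partial u_3')\,\partial\theta'/\partial b = 0,$$

and will, therefore, also be continuous and have continuous partial derivatives
of the first order, on account of the continuity requirements on $G$. We
further restrict the $\psi_\alpha$ by the following conditions:

$$\partial\psi_1/\partial a = \xi_1(x), \quad \partial\psi_1/\partial b = 0, \quad \partial\psi_2/\partial a = 0, \quad \partial\psi_2/\partial b = \xi_2(x),$$

when $a=b=0$. We employ the following notation: $\partial\theta/\partial a = \theta_a(x)$ and
$\partial\theta/\partial b = \theta_b(x)$ when $a=b=0$.


* See Bliss, loc. cit., p. 3.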

Since we have assumed the end values of $u_1$ and $u_2$ to be fixed at $x_1$
as well as at $x_0$, we can write

$$\xi_1(x_0) = \xi_1(x_1) = \xi_2(x_0) = \xi_2(x_1) = 0.$$


For the parametric values $a=b=0$ the function $\theta(x, 0, 0) = \theta(x)$ must
satisfy the differential equations of partial variations

$$(\partial G/\partial u_1)\xi_1 + (\partial G/\partial u_1')\xi_1' + (\partial G/\partial u_3)\theta_a + (\partial G/\partial u_3')\theta_a' = 0, \tag{4A}$$

$$(\partial G/\partial u_2)\xi_2 + (\partial G/\partial u_2')\xi_2' + (\partial G/\partial u_3)\theta_b + (\partial G/\partial u_3')\theta_b' = 0. \tag{4B}$$

The first of these determines $\theta_a$ in terms of $\xi_1$ and the partial derivatives of
$G$ with respect to $u_1$ and $u_1'$, except for a constant, whereas the second
determines $\theta_b$ in terms of $\xi_2$ and the partial derivatives of $G$ with respect
to $u_2$ and $u_2'$, except for a constant. Choosing these constants so that each
of the partial variations $\theta_a(x_0)$ and $\theta_b(x_0)$ vanishes implies that the total
variation of the function $\theta$ be zero at $x_0$, i.e. $\delta\theta = \theta_a\delta a + \theta_b\delta b = 0$ at $x=x_0$.
Conversely, since the $\psi_\alpha$ are arbitrary, the vanishing of $\delta\theta$ implies the
vanishing of both $\theta_a$ and $\theta_b$. The equations (4) and the initial conditions (2),
therefore, completely determine the variations of $u_3$. The functions $u_1$ and
$u_2$ have thus been classified as independent functions in a manner similar
to the way in which variables are classified in the ordinary theory of maxima
and minima of functions.

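If functions $u_i(x, a, b)$, defining a two-parameter family of admissible
arcs containing $\Gamma$ for the parametric values $a=b=0$, are substituted in $I_1$,
this integral becomes a function of $a$ and $b$ defined by

$$I_1(a,b) = \int_{x_0}^{x_1} F_1(u_1(x,a,b), u_1'(x,a,b), \dots, u_3'(x,a,b), x)\,dx.$$

The partial variation of this integral with respect to $a$ reduces to

$$(\partial I_1/\partial a)\delta a = \delta a\int_{x_0}^{x_1} [(\partial F_1/\partial y_1)\xi_1 + (\partial F_1/\partial y_1')\xi_1' + (\partial F_1/\partial y_3)\theta_a + (\partial F_1/\partial y_3')\theta_a']\,dx$$

for $a=b=0$.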




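Instead of proceeding in the classical way we shall solve the differential
equation (4) for $\theta_a$ and develop a theory without the use of Lagrange
multipliers.* This procedure seems to be more directly an extension of the
ordinary theory of maxima and minima; it allows us to obtain the Weierstrass,
Legendre and Jacobi conditions by an analysis which is simpler than that used
in the classical theory, and, furthermore, it leads to a method for solving the
Lagrange problem when the differential equations are replaced by functional
equations of the type (1). We proceed as follows:

Since by hypothesis $\partial G/\partial y_3'$ is not zero in the interval $x_0 \le x \le x_1$, the
solution of (4) for $\theta_a$ is

$$\theta_a = \int_{x_0}^{x} e^{V_1(x)-V_1(t)}\,[(\partial G_1'/\partial y_1)\xi_1 + (\partial G_1'/\partial y_1')\xi_1']\,dt, \tag{5}$$

where the following notation has been introduced:
$(\partial G/\partial y_3)/(\partial G/\partial y_3') = -\partial G_1'/\partial y_3$;
$(\partial G/\partial y_1)/(\partial G/\partial y_3') = -\partial G_1'/\partial y_1$;
$(\partial G/\partial y_1')/(\partial G/\partial y_3') = -\partial G_1'/\partial y_1'$;
$V_1 = \int_{x_0}^{x} (\partial G_1'/\partial y_3)\,ds$. The assumption $\theta_a(x_0) = 0$, made above, does
not necessarily impose a limitation on this method, for, if $u_3$ were variable at $x_0$,
the solution for $\theta_a$ would be the solution above plus the variation of $u_3$ at $x_0$.
Differentiation of (5) with respect to $x$ determines $\theta_a'$ by the formula

$$\theta_a' = (\partial G_1'/\partial y_1)\xi_1 + (\partial G_1'/\partial y_1')\xi_1' + (\partial G_1'/\partial y_3)\int_{x_0}^{x} e^{V_1(x)-V_1(t)}\,[(\partial G_1'/\partial y_1)\xi_1 + (\partial G_1'/\partial y_1')\xi_1']\,dt. \tag{6}$$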
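
The passage from (4) to (5) and (6) is the usual integrating-factor solution of a first-order linear equation. The sketch below is illustrative only: the symbols $P$ and $Q$ are placeholders introduced here, standing for the coefficient of $\theta_a$ and the inhomogeneous term obtained after dividing (4A) by $\partial G/\partial y_3'$; they are not notation from the paper.

```latex
% Assumed normal form of (4A) after division by \partial G/\partial y_3':
%   \theta' = P(x)\,\theta + Q(x), \qquad \theta(x_0) = 0.
\begin{align*}
V(x) &= \int_{x_0}^{x} P(s)\,ds, \\
\frac{d}{dx}\bigl(e^{-V(x)}\theta\bigr) &= e^{-V(x)}\bigl(\theta' - P\,\theta\bigr) = e^{-V(x)}Q(x), \\
\theta(x) &= \int_{x_0}^{x} e^{V(x)-V(t)}\,Q(t)\,dt, \\
\theta'(x) &= Q(x) + P(x)\int_{x_0}^{x} e^{V(x)-V(t)}\,Q(t)\,dt .
\end{align*}
```

The third line has the shape of (5) and the fourth the shape of (6); the hypothesis $\partial G/\partial y_3' \neq 0$ is what permits the division that produces $P$ and $Q$.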
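When the values of $\theta_a$ and $\theta_a'$ as given by (5) and (6) are substituted in
the expression defining the partial variation of $I_1$ with respect to $a$, it becomes
for $a=b=0$

$$(\partial I_1/\partial a)\delta a = \delta a\int_{x_0}^{x_1}\Bigl\{[\partial F_1/\partial y_1 + (\partial F_1/\partial y_3')(\partial G_1'/\partial y_1)]\xi_1 + [\partial F_1/\partial y_1' + (\partial F_1/\partial y_3')(\partial G_1'/\partial y_1')]\xi_1'$$
$$\qquad + [\partial F_1/\partial y_3 + (\partial F_1/\partial y_3')(\partial G_1'/\partial y_3)]\int_{x_0}^{x} e^{V_1(x)-V_1(t)}\,[(\partial G_1'/\partial y_1)\xi_1 + (\partial G_1'/\partial y_1')\xi_1']\,dt\Bigr\}\,dx.$$

An application of Dirichlet's formula for changing the order of integration
of an iterated integral, followed by an interchange of $t$ and $x$, the parameters
of integration, yields the equation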

* Hadamard, loc. cit., Chapter VI, gives the classical theory. 


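$$(\partial I_1/\partial a)\delta a = \delta a\int_{x_0}^{x_1}\bigl\{[\partial F_1/\partial y_1 + (\partial F_1/\partial y_3')(\partial G_1'/\partial y_1) + (\partial G_1'/\partial y_1)W_1]\xi_1$$
$$\qquad + [\partial F_1/\partial y_1' + (\partial F_1/\partial y_3')(\partial G_1'/\partial y_1') + (\partial G_1'/\partial y_1')W_1]\xi_1'\bigr\}\,dx,$$

where

$$W_1 = \int_{x}^{x_1} e^{V_1(t)-V_1(x)}\,[\partial F_1/\partial y_3 + (\partial F_1/\partial y_3')(\partial G_1'/\partial y_3)]\,dt.$$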

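Since $\xi_1(x)$ vanishes at $x_0$ and $x_1$ by hypothesis, an integration by parts
performed on the terms involving $\xi_1(x)$ of the partial variation of $I_1$ with
respect to $a$ furnishes the expression

$$(\partial I_1/\partial a)\delta a = \delta a\int_{x_0}^{x_1}\Bigl\{-\int_{x_0}^{x}[\partial F_1/\partial y_1 + (\partial F_1/\partial y_3')(\partial G_1'/\partial y_1) + (\partial G_1'/\partial y_1)W_1]\,dx$$
$$\qquad + [\partial F_1/\partial y_1' + (\partial F_1/\partial y_3')(\partial G_1'/\partial y_1') + (\partial G_1'/\partial y_1')W_1]\Bigr\}\,\xi_1'(x)\,dx,$$

where the coefficient of $\xi_1'(x)$ is continuous because of the continuity
requirements on $F_1$ and $G$.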

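If $I_1$ is to be a maximum along the curve $\Gamma$, it is necessary that $(\partial I_1/\partial a)\delta a$
be zero for all values of the functions $\xi_1(x)$. By a well known theorem of the
calculus of variations it follows that the coefficient of $\xi_1'(x)$ must be a
constant, that is,

$$\partial F_1/\partial y_1' + (\partial F_1/\partial y_3')(\partial G_1'/\partial y_1') + (\partial G_1'/\partial y_1')W_1 = \int_{x_0}^{x}[\partial F_1/\partial y_1 + (\partial F_1/\partial y_3')(\partial G_1'/\partial y_1) + (\partial G_1'/\partial y_1)W_1]\,dx + C_1, \tag{7}$$

where $C_1$ is a constant to be determined by the initial conditions.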
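
The "well known theorem" invoked here is usually called the lemma of Du Bois-Reymond. A hedged sketch of how it applies is given below; $M$ and $N$ are placeholder names, introduced for this illustration, for the coefficients of $\xi_1$ and $\xi_1'$ in the first variation.

```latex
% Suppose, for every admissible \xi with \xi(x_0) = \xi(x_1) = 0,
%   \int_{x_0}^{x_1} [ M(x)\,\xi + N(x)\,\xi' ]\,dx = 0 .
% Integrating the first term by parts (the boundary term vanishes):
\begin{align*}
\int_{x_0}^{x_1} M\,\xi\,dx
  &= -\int_{x_0}^{x_1}\Bigl[\int_{x_0}^{x} M(s)\,ds\Bigr]\xi'(x)\,dx, \\
\int_{x_0}^{x_1}\Bigl[N(x) - \int_{x_0}^{x} M(s)\,ds\Bigr]\xi'(x)\,dx &= 0
  \quad\Longrightarrow\quad
  N(x) - \int_{x_0}^{x} M(s)\,ds = C .
\end{align*}
```

The conclusion holds because the only continuous functions orthogonal to every $\xi'$ with $\int \xi'\,dx = 0$ are constants; equation (7) is this statement with the appropriate $M$ and $N$.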
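An entirely similar analysis applied to $I_2$ yields the necessary condition

$$\partial F_2/\partial y_2' + (\partial F_2/\partial y_3')(\partial G_1'/\partial y_2') + (\partial G_1'/\partial y_2')W_2 = \int_{x_0}^{x}[\partial F_2/\partial y_2 + (\partial F_2/\partial y_3')(\partial G_1'/\partial y_2) + (\partial G_1'/\partial y_2)W_2]\,dx + C_2. \tag{8}$$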

The functional-differential equations (7) and (8) are the analogues of the
Euler equations in the Du Bois-Reymond form.* Wherever the maximizing
curve $\Gamma$ has a continuously turning tangent we can differentiate (7) and (8)
with respect to $x$ and obtain functional-differential equations which involve
second-order derivatives and which are the analogues of the Euler equations.
We can, therefore, state the following theorem.


* Du Bois-Reymond, Mathematische Annalen, vol. 15 (1879), p. 313.


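THEOREM 1. In order that an admissible arc $\Gamma$ in the space $(u_1, u_2, u_3, x)$,
satisfying a differential equation $G(u_1, u_1', u_2, u_2', u_3, u_3', x) = 0$ and initial
conditions $u_i(x_0) = u_{i0}$, $u_i(x_1) = u_{i1}$, maximize an integral $I_1$ when $u_2$ is not
allowed to vary and at the same time maximize a second integral $I_2$ when $u_1$ is
not allowed to vary, it is necessary that this curve satisfy the functional-differential
equations (7) and (8). If the maximizing curve has a continuously turning tangent
at $x$, $x_0 \le x \le x_1$, it must satisfy the equations

$$\partial F_k/\partial y_k + (\partial F_k/\partial y_3')(\partial G_1'/\partial y_k) + (\partial G_1'/\partial y_k)W_k - \frac{d}{dx}\bigl[\partial F_k/\partial y_k' + (\partial F_k/\partial y_3')(\partial G_1'/\partial y_k') + (\partial G_1'/\partial y_k')W_k\bigr] = 0 \quad (k = 1,2), \tag{9}$$

obtained by differentiating (7) and (8) with respect to $x$.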
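
As a consistency check (a sketch under an assumption not made in the paper), suppose the side condition does not involve $u_1$, $u_1'$, $u_2$, $u_2'$ at all, so that the ratios written $\partial G_1'/\partial y_k$ and $\partial G_1'/\partial y_k'$ vanish for $k = 1, 2$. The conditions of Theorem 1 should then collapse to the unconstrained Euler equations:

```latex
% Assumption for this check only:
%   \partial G_1'/\partial y_k = \partial G_1'/\partial y_k' = 0, \qquad k = 1, 2.
\begin{align*}
\text{(7) becomes}\quad
  \frac{\partial F_1}{\partial y_1'} &= \int_{x_0}^{x}\frac{\partial F_1}{\partial y_1}\,dx + C_1
  \qquad\text{(Du Bois-Reymond form)}, \\
\text{and, on differentiating,}\quad
  \frac{\partial F_1}{\partial y_1} - \frac{d}{dx}\frac{\partial F_1}{\partial y_1'} &= 0
  \qquad\text{(classical Euler equation)} .
\end{align*}
```

In this degenerate case the terms in $W_k$ drop out entirely, which indicates that $W_k$ carries the whole effect of the constraint on the variation of $u_3$.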


Functional-differential equations of the type (9), with our form of $W_k$,
have not been discussed in the literature. It would be desirable to be able
to say that a unique solution of these equations plus the differential equation
$G=0$ exists whenever end values $u_i(x_0) = u_{i0}$ and $u_i(x_1) = u_{i1}$ are given.
This problem will not be discussed in the present paper.* It may be mentioned,
however, that I have already exhibited a special example for which the system
(9) reduces to a system of Volterra integral equations, and have actually
found the solution.† Let us examine (7) and (8) from a different point of view.

In particular if $F_1 = F_2$ the problem reduces to a strict Lagrange problem.
No assumption which would prevent this has been made, hence we have
the following


COROLLARY. The equations resulting from (7) and (8) by putting $F_1 = F_2$
must be satisfied by a curve satisfying a differential equation $G=0$ and initial
conditions $u_i(x_0) = u_{i0}$, $u_i(x_1) = u_{i1}$ if this curve is to maximize an integral

$$I = \int_{x_0}^{x_1} F_1(u_1, u_1', u_2, u_2', u_3, u_3', x)\,dx$$

in which both $u_1$ and $u_2$ vary independently.


* L. M. Graves, Implicit functions and differential equations in general analysis, these
Transactions, vol. 29, pp. 515-552, gives imbedding and existence theorems for a system which includes
(9) as a special case. If $y_k'''$ is continuous, we can reduce (9) to a differential equation of the third
order by a differentiation, because of the form of $W_k$, and existence theorems for differential
equations will apply.

† Roos, A mathematical theory of competition, loc. cit., p. 167.


The methods of this part can be extended without difficulty to the case
for which there are integrals

$$I_k = \int_{x_0}^{x_1} F_k(u_1, u_1', \dots, u_n, u_n', x)\,dx$$

and one differential equation $G(u_1, u_1', \dots, u_n, u_n', x) = 0$, in
which case functional equations of the type (7) result.


III. VARIABLE END POINTS. ANALOGUES OF WEIERSTRASS AND LEGENDRE 
CONDITIONS 


4. Problem with one end point variable. In the preceding paragraphs a
problem in simultaneous maxima for fixed end points has been considered.
The problem is even more interesting when one end parameter, say $x_1$, and the
corresponding end values are allowed to vary.

Consider the problem of determining a curve $\Gamma$ in the space $(u_1, u_2, u_3,
u_4, x)$ satisfying a differential equation

$$G(u_1, u_1', u_2, u_2', u_3, u_3', u_4, u_4', x) = 0$$

such that an integral

$$I_1 = \int_{x_0}^{x_1} F_1(u_1, u_1', \dots, u_4, u_4', x)\,dx$$

is a maximum when $u_1$ and $u_2$ are allowed to vary independently, but not $u_3$,
and such that a second integral

$$I_2 = \int_{x_0}^{x_1} F_2(u_1, u_1', \dots, u_4, u_4', x)\,dx$$

is a maximum when $u_3$ is allowed to vary independently, but not $u_1$ and $u_2$.
We assume the end parameter $x_0$ and the end values $u_i(x_0) = u_{i0}$ to be fixed,
and the end parameter $x_1$ and the corresponding end values of the $u_i$ to be
variable. Let us assume as we did in Part II that $\partial G/\partial u_4' \neq 0$ for the region
which contains admissible arcs $u_i(x)$, $i=1, 2, 3, 4$, and that the functions
$F_\alpha$, $\alpha=1, 2$, and $G$ are continuous in $u_1, u_2, u_3, u_4, u_1', u_2', u_3', u_4', x$ and have
continuous partial derivatives of the first order with respect to these arguments.

5. Functional transversality conditions. In the functions $F_\alpha$ and $G$
replace the functions $u_i(x)$, $i=1, 2, 3, 4$, by a set $u_i = f_i(x, a_1, a_2, a_3)$, where
the $f_i$ are functions of $x$ and parameters $a_1$, $a_2$ and $a_3$, continuous and admitting
continuous derivatives up to the second order with respect to $x$ and these
parameters in the domain $0 \le a_1 \le h_1$; $0 \le a_2 \le h_2$; $0 \le a_3 \le h_3$. The
functions $f_1$, $f_2$ and $f_3$ are otherwise arbitrary, but $f_4$ is determined by $G=0$
and the initial condition $u_4(x_0) = u_{40}$. Let the limit of integration $x_1$ be a
similar function of the parameters $a_i$, $i=1, 2, 3$, i.e. $x_1 = x_1(a_1, a_2, a_3)$.

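By the ordinary rules of differentiation the differential of the integral $I_1$,
which is also a function of the $a_i$, is for $u_3$ (= constant)

$$\delta I_1 = [F_1(u_1, u_1', \dots, u_4, u_4', x)\,\delta x]^{x_1} + \int_{x_0}^{x_1}[(\partial F_1/\partial u_i)\,\delta f_i + (\partial F_1/\partial u_i')\,\delta f_i']\,dx, \tag{10}$$

where $\delta f_i = (\partial f_i/\partial a_1)\delta a_1 + (\partial f_i/\partial a_2)\delta a_2 + (\partial f_i/\partial a_3)\delta a_3$ and $i$ is an umbral index
for the values 1, 2, 4, but not for 3, according to the convention that whenever
a literal suffix appears twice in a term that term is to be summed for values of
the suffix.* The variation of $u_3$ in $F_1$ is by hypothesis equal to zero, hence
$\delta f_3 = 0$.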
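
Formula (10) is an instance of differentiating an integral whose upper limit and integrand both depend on parameters. A one-parameter sketch follows; the single parameter $a$ and the function $\Phi$ are simplifications introduced for this illustration, whereas the text uses three parameters $a_1, a_2, a_3$.

```latex
% Let \Phi(x,a) stand for the integrand evaluated along the arc u_i = f_i(x,a).
\begin{align*}
J(a) &= \int_{x_0}^{x_1(a)} \Phi(x,a)\,dx, \\
\frac{dJ}{da}\,\delta a
  &= \Phi\bigl(x_1(a),a\bigr)\,\frac{dx_1}{da}\,\delta a
   + \int_{x_0}^{x_1(a)} \frac{\partial \Phi}{\partial a}\,\delta a\,dx .
\end{align*}
% With \delta x = (dx_1/da)\,\delta a at the upper limit, and with
% (\partial\Phi/\partial a)\,\delta a expanded by the chain rule in \delta f_i and \delta f_i',
% the two terms correspond to the boundary term and the integral in (10).
```

The lower limit contributes nothing because $x_0$ is held fixed.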

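As already stated the variations of $f_1$ and $f_2$ are to be arbitrary (except for
continuity properties), but we can not take the variation of $f_4$ to be arbitrary,
for it is determined by the differential equation of partial variation

$$(\partial G/\partial u_4)\,\delta f_4 + (\partial G/\partial u_4')\,\delta f_4' + (\partial G/\partial u_k)\,\delta f_k + (\partial G/\partial u_k')\,\delta f_k' = 0,$$

where $k$ is an umbral index taking on the values 1 and 2 only. Since $\partial G/\partial u_4'$
does not vanish and is continuous in the interval by hypothesis, and, furthermore,
since $\delta(df/dx) = (d/dx)\delta f$, this expression can be regarded as a first-order
differential equation for the determination of $\delta f_4$ in terms of $\delta f_1$ and
$\delta f_2$ and the initial value of $\delta f_4$ at $x = x_0$. Since we have supposed $\delta f_4(x_0) = 0$,
we may write

$$\delta f_4 = \int_{x_0}^{x} e^{V_1(x)-V_1(t)}\,[(\partial G_1'/\partial u_k)\,\delta f_k + (\partial G_1'/\partial u_k')\,\delta f_k']\,dt,$$

where the expressions of the form $\partial G_1'/\partial u_k$, etc. have meanings similar to
the corresponding ratios defined in (5). As in (5) we determine the value of
$\delta f_4'$ by differentiation of the above expression. If the values of $\delta f_4$ and $\delta f_4'$
so found be substituted in (10), it becomes

$$\delta I_1 = [F_1\,\delta x]^{x_1} + \int_{x_0}^{x_1}\Bigl\{[\partial F_1/\partial u_k + (\partial F_1/\partial u_4')(\partial G_1'/\partial u_k)]\,\delta f_k + [\partial F_1/\partial u_k' + (\partial F_1/\partial u_4')(\partial G_1'/\partial u_k')]\,\delta f_k'$$
$$\qquad + [\partial F_1/\partial u_4 + (\partial F_1/\partial u_4')(\partial G_1'/\partial u_4)]\int_{x_0}^{x} e^{V_1(x)-V_1(t)}\,[(\partial G_1'/\partial u_k)\,\delta f_k + (\partial G_1'/\partial u_k')\,\delta f_k']\,dt\Bigr\}\,dx.$$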


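* See A. S. Eddington, The Mathematical Theory of Relativity, p. 50.


An application of Dirichlet's formula to the iterated integral followed
by an interchange of the parameters $x$ and $t$ as before reduces the above
formula for $\delta I_1$ to

$$\delta I_1 = [F_1\,\delta x]^{x_1} + \int_{x_0}^{x_1}\bigl\{[\partial F_1/\partial u_k + (\partial F_1/\partial u_4')(\partial G_1'/\partial u_k) + (\partial G_1'/\partial u_k)W_1]\,\delta f_k$$
$$\qquad + [\partial F_1/\partial u_k' + (\partial F_1/\partial u_4')(\partial G_1'/\partial u_k') + (\partial G_1'/\partial u_k')W_1]\,\delta f_k'\bigr\}\,dx, \tag{10B}$$

where

$$W_1 = \int_{x}^{x_1} e^{V_1(t)-V_1(x)}\,[\partial F_1/\partial u_4 + (\partial F_1/\partial u_4')(\partial G_1'/\partial u_4)]\,dt.$$

Since the $f_k(x, a_1, a_2, a_3)$, $k=1, 2$, have by hypothesis continuous second
derivatives with respect to $x$, the formula for integration by parts can be
applied to the second member of (10B), so that*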


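$$\delta I_1 = [F_1\,\delta x]^{x_1} + \bigl[\{\partial F_1/\partial u_k' + (\partial F_1/\partial u_4')(\partial G_1'/\partial u_k')\}\,\delta f_k\bigr]^{x_1}$$
$$\qquad + \int_{x_0}^{x_1}\Bigl\{\partial F_1/\partial u_k + (\partial F_1/\partial u_4')(\partial G_1'/\partial u_k) + (\partial G_1'/\partial u_k)W_1 - \frac{d}{dx}\bigl[\partial F_1/\partial u_k' + (\partial F_1/\partial u_4')(\partial G_1'/\partial u_k') + (\partial G_1'/\partial u_k')W_1\bigr]\Bigr\}\,\delta f_k\,dx,$$

where $k$ is umbral as before.

By definition $\delta f_k = (\partial f_k/\partial a_i)\delta a_i$, where $i$ is umbral with range 1, 2, 3,
hence the variation of $u_k$ is given by $\delta u_k = u_k'\,\delta x + \delta f_k$. If the value of $\delta f_k$
defined by this equation be substituted in $\delta I_1$, the following formula results:

$$\delta I_1 = \bigl[F_1\,\delta x + \{\partial F_1/\partial u_k' + (\partial F_1/\partial u_4')(\partial G_1'/\partial u_k')\}(\delta u_k - u_k'\,\delta x)\bigr]^{x_1}$$
$$\qquad + \int_{x_0}^{x_1}\Bigl\{\partial F_1/\partial u_k + (\partial F_1/\partial u_4')(\partial G_1'/\partial u_k) + (\partial G_1'/\partial u_k)W_1 - \frac{d}{dx}\bigl[\partial F_1/\partial u_k' + (\partial F_1/\partial u_4')(\partial G_1'/\partial u_k') + (\partial G_1'/\partial u_k')W_1\bigr]\Bigr\}\,\delta f_k\,dx.$$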

We define an arc $u_i = u_i(x)$, $i=1, 2, 3, 4$, as an extremal arc if it has
continuous derivatives $du_i/dx$ and $d^2u_i/dx^2$ in the interval $x_0 \le x \le x_1$, and if,
furthermore, it satisfies the differential equation $G(u_1, u_1', \dots, u_4, u_4', x) = 0$,
the set of two equations

$$\partial F_1/\partial u_k + (\partial F_1/\partial u_4')(\partial G_1'/\partial u_k) + (\partial G_1'/\partial u_k)W_1 - \frac{d}{dx}\bigl[\partial F_1/\partial u_k' + (\partial F_1/\partial u_4')(\partial G_1'/\partial u_k') + (\partial G_1'/\partial u_k')W_1\bigr] = 0 \quad (k = 1,2), \tag{11}$$

and a similar set for the integral $I_2$.


* Hadamard, loc. cit., p. 60.
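If an extremal $\Gamma$ is to maximize $I_1$ for $u_3$ constant, it is necessary that the
differential

$$\delta I_1(\Gamma) = F_1\,\delta x_1 + [\partial F_1/\partial u_k' + (\partial F_1/\partial u_4')(\partial G_1'/\partial u_k')]\,[\delta u_k - u_k'\,\delta x_1]$$

vanish for all possible choices of $\delta x_1$ and $\delta u_k(x_1)$, $k=1, 2$. We can thus state
the transversality theorem: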


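THEOREM 2. If for an admissible arc $\Gamma$, one of whose end points is fixed
at $x_0$ while the other varies over a $V_3$ defined by $u_3 =$ constant and a differential
equation $G=0$, the value $I_1(\Gamma)$, for $u_3 =$ constant, $G=0$, is a maximum with
respect to the values of $I_1$ on neighboring admissible arcs, issuing from the same
fixed point 0, then at the intersection point 1 of $\Gamma$ with $V_3$, the directional
coefficients of $V_3$ and the element $(u_1, u_1', \dots, u_4', x)$ of $\Gamma$ must satisfy the
relations*

$$F_1(u_1, u_1', \dots, u_4, u_4', x) - u_k'\,[\partial F_1/\partial u_k' + (\partial F_1/\partial u_4')(\partial G_1'/\partial u_k')] = 0,$$
$$\partial F_1/\partial u_k' + (\partial F_1/\partial u_4')(\partial G_1'/\partial u_k') = 0 \quad (k = 1,2). \tag{12}$$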
If we apply a similar analysis to the integral $I_2$, for $u_1$ and $u_2$ constant, we
obtain a differential

$$\delta I_2 = F_2\,\delta x_1 + [\partial F_2/\partial u_3' + (\partial F_2/\partial u_4')(\partial G_1'/\partial u_3')]\,[\delta u_3 - u_3'(x_1)\,\delta x_1]$$

along an extremal for the integral $I_2$. If, therefore, $\Gamma$ is also to maximize
$I_2$ for $u_1$ and $u_2$ constant, then at 1, the intersection of $\Gamma$ with $V_2$, defined by
$u_1 =$ constant, $u_2 =$ constant and $G=0$, it is necessary that the equations

$$F_2(u_1, u_1', \dots, u_4, u_4', x) - u_3'\,[\partial F_2/\partial u_3' + (\partial F_2/\partial u_4')(\partial G_1'/\partial u_3')] = 0,$$
$$\partial F_2/\partial u_3' + (\partial F_2/\partial u_4')(\partial G_1'/\partial u_3') = 0 \tag{13}$$

hold. In equations (12) and (13) we have five equations for the determination
of the four end values $x_1$, $u_h(x_1)$, $h=1, 2, 3$, of an extremal $\Gamma$. In
general, therefore, the problem of simultaneous maxima is not possible for
the case for which the second end parameter $x_1$ is required to be the same for
both $I_1$ and $I_2$. Hence, it will be understood in our work that $x_1$ has, in general,
different values for $I_1$ and $I_2$. The conditions (12) and (13) are functional
in form and will be called functional transversality conditions.


* From a consideration of the classical theory of the Lagrange problem with second end point
variable we would expect to have four transversality conditions instead of three as given by (12),
but we have not used the condition that $\delta u_4(x_1)$ is arbitrary, since it is a function of arbitrary
functions $\delta f_k$, and hence we lack this condition. If we perform an integration by parts on the term in
$\delta f_4'$ of (10), and then substitute for $\delta f_4$ as we did above, we obtain a term $(\partial F_1/\partial u_4')\,\delta f_4$ besides terms
in $\delta f_k$. Since $\delta f_4$ is arbitrary at $x_1$ it follows that $\partial F_1/\partial u_4' = 0$ at $x = x_1$. We may, therefore, by the help
of (12) write the transversality condition as

$$F_1(u_1, u_1', \dots, u_4', x) - u_k'\,(\partial F_1/\partial u_k') = 0, \qquad \partial F_1/\partial u_k' = 0 \quad (k = 1, 2, 4). \tag{12A}$$

The equations (12A) are the analogues of the usual transversality conditions. (See Bliss, loc. cit., p.
167.)

6. Analogue of the Weierstrass necessary condition. By the aid of the
expression for $\delta I_1(\Gamma)$ we can state the following theorem:


THEOREM 3. The value of an integral $I_1$, taken along a two-parameter family
of extremal arcs $E_{01}$ determined by the equations $u_k = f_k(x, a_1, a_2)$, $k=1, 2, 4$,
$G=0$, and the hypersurface $u_3 = f_3(x) =$ constant, one of whose end points, $x_0$,
is fixed while the other, $x_1$, varies, has a differential

$$dI_1 = F_1(u_1, p_1, u_2, p_2, u_3, u_3', u_4, p_4, x)\,dx + [\partial F_1/\partial u_k' + (\partial F_1/\partial u_4')(\partial G_1'/\partial u_k')]\,[du_k - p_k\,dx],$$

where at the point 1, the differentials $dx$ and $du_k$ are those belonging to $V_3$
($u_3 =$ constant) described by the end points of the extremals, while the $u_k$, $u_4$,
$p_k$ and $p_4$ refer to the extremal $E_{01}$. The functions $F_1$ and $G$ have arguments
$(u_1, p_1, u_2, p_2, u_3, u_3', u_4, p_4, x)$, where the $p_1$, $p_2$ and $p_4$ are the directional
coefficients of the extremal $E_{01}$ for $u_3 =$ constant.


There is an entirely analogous theorem for the integral $I_2$. For $I_2$ the
functions $F_2$ and $G$ have arguments $(u_1, u_1', u_2, u_2', u_3, p_3, u_4, p_4, x)$.

The integral of $dI_1$ corresponds to the Hilbert integral and possesses similar
properties. In a manner analogous to the classical method of the calculus
of variations it is possible to obtain the necessary conditions of Weierstrass
and Legendre and to obtain sufficient conditions for relative strong and
weak maxima.* To do this we first define an extremal field in the sense in
which we shall use it in this chapter.

We shall say that a connected region $R$ of the space $(u_1, u_2, u_3, u_4, x)$
is a simply covered extremal field if there exists a family of extremals
dependent upon three parameters such that one and only one extremal of
this simply covered field passes through every point of $R$, and if, furthermore,
the directional coefficients $du_h/dx = p_h(u_1, \dots, x)$, $h=1, \dots, 4$, of the
tangent to the extremal, which passes through the point $(u_1, u_2, u_3, u_4, x)$,
are continuous functions, admitting continuous partial derivatives in $R$
up to the second order. We shall assume that such a field exists and that it
contains $V_3$.

It is quite evident that along an extremal arc of a field, the integral
$I_1^* = \int dI_1$ has the same value as $I_1$, for $\delta u_k = p_k\,\delta x$ along an extremal, and the
integrand of $I_1^*$ thus reduces to the integrand of $I_1$.


* For the classical analysis see Hadamard, loc. cit., p. 364. See also Bliss, loc. cit., p. 50.

To obtain an analogue of the Weierstrass condition we select a point (3)
on $E_{01}$, the extremal which we are assuming to give the desired maximum, and
through this point (3), holding $u_3(x, a_1, a_2, a_3) = u_3(x)$ constant, pass an
otherwise arbitrary curve $C_{12}$ with continuously turning tangent in $R$. We
note that $R$ may be partly bounded by $V_3$, so that when (3) is at (1) the curves
$C_{12}$ are further limited. Such a curve $C_{12}$ will have equations $u_1 = U_1(t)$,
$u_2 = U_2(t)$, $u_3 = u_3(t)$, $u_4 = U_4(t)$.

We join the fixed point 0 to a movable point 2 on $C_{12}$ by a one-parameter
family of arcs $E_{02}$, containing $E_{01}$ as a member when the point 2 is in the
position 1. We choose the parameter $t$ on $C_{12}$ increasing as 2 moves towards
1, noting that the arc length $s$ is a possible $t$. If $E_{01}$ is to give a maximum
for admissible arcs in $R$, i.e., $I_1(E_{02}+C_{21}) \le I_1(E_{01})$, where $C_{21}$ and $E_{02}$ are
obtained by putting $a_3 = 0$, it follows that $dI_1(E_{02}+C_{21})$
must be $\ge 0$ for 2 sufficiently close to 1, and in particular at 1 itself, that
is, $dI_1(C_{12} - E_{02}) \le 0$ must hold.

This differential is given by the value at the point 1 of the expression 


F\(U,, Ui, U2, Uz ,us, us , U4, 
Fi(t, ui, ue, us U3, , U4, ud , x)bx 
— [AF + (AF ) | — ug dx], 


the differentials in this expression belonging to the arc Cy. and, therefore, 
satisfying the equation 6u,=U; 6x. At the point 1 the codrdinates of Cy 
and Eq are equal, so that this expression can be written as 


dI\(Ci2 Eo2) => [Fi(m, Ui Us U3, Ug 
— U3 ,U4, Ud , x) 
— [Ud — uf ] [OF + (AF )] 


We shall call the coefficient of $\delta x$ in the above expression $E_1$, because it
is an analogue of the Weierstrass E-function.† Since the differential
$dI_1(C_{12} - E_{02})$ must be negative or zero for an arbitrarily selected point 1
and an arc $C$ through it, we have the following theorem:


† Bliss, loc. cit., p. 130.


374 Cc. F. ROOS [April 


THEOREM 4. At every element $(u_1, u_1', \dots, u_4, u_4', x)$ of an arc $E_{01}$ which
maximizes an integral $I_1$ when $u_3$ is not allowed to vary and which satisfies a
differential equation $G(u_1, u_1', \dots, u_4', x) = 0$, the Weierstrass condition


must be satisfied for every admissible set $(u_1, U_1', u_2, U_2', u_3, u_3', u_4, U_4', x)$,
different from $(u_1, u_1', u_2, u_2', u_3, u_3', u_4, u_4', x)$, for all values of the coordinates
$(u_1, u_2, u_3, u_4, x)$ in the region $R$.


A similar analysis applied to the integral J, yields the following theorem: 


THEOREM 5. At every element $(u_1, u_1', \dots, u_4, u_4', x)$ of an arc $E_{01}$ which
maximizes an integral $I_2$ when $u_1$ and $u_2$ are not allowed to vary and which
satisfies a differential equation $G = 0$, the condition


(15) ui ,U2, ,U3,u3 ,Uzs ug ,U{ ,x) 
must be satisfied for every admissible set $(u_1, u_1', u_2, u_2', u_3, U_3', u_4, U_4', x)$,
different from $(u_1, u_1', u_2, u_2', u_3, u_3', u_4, u_4', x)$, for all values of the coordinates
$(u_1, u_2, u_3, u_4, x)$ in the region $R$.


The conditions (11), (12), (13), (14) and (15) are necessary conditions
which must be satisfied by an arc $E_{01}$ which furnishes a solution of the problem
of this paper. In the following paragraph another necessary condition
will be obtained.

7. Analogue of the Legendre necessary condition. For brevity we will 
consider only the first integral J,. If the function Fi(m#, U/, ue, Ud, us, ud, 
uy, Uj, x) be expanded by means of Taylor’s formula, the following ex- 
pression is obtained: 

Ut ,u2, Ud ,u3,u3 ,x) = Us us , us, Ud , x) 
+ [Us — ud + (OF ) | 
+43 [Ud — uf — uf , 
where A,,=0F;,/du; +(0F,/duj )(Ouj /dus) and hk and k are umbral 
indices with range 1, 2. The argumenis of Ax, are (m, ui +0(Ui —u/), 
ta, us —ud), us, us, Us, ui —uj), x), where 0<6<1. It 
should be noted that in this formula w; is not allowed to vary. 

Since the partial derivative of G with respect to u, determines duj /duy , 

the function £, is given by the formula 


Ex(u,,ui , U1, ,Ug , x) ux )(UK ux ). 


Let us write T=Uj and V=Uj —uj. We may then 
by the help of (14) state the following theorem: 




THEOREM 6. If the extremal $E_{01}$ makes $I_1$ a maximum when $u_3$ is not allowed
to vary, and at the same time makes $I_2$ a maximum when $u_1$ and $u_2$ are
not allowed to vary, it is necessary that the quadratic differential forms


(16) | + + | + |, 


(17) | 


be definite negative forms for all systems of finite values of $u_1'$, $u_2'$, $u_3'$ and $u_4'$,
when the point $(u_1, u_2, u_3, u_4, x)$ remains in the domain $R$.†


In (16) $u_3$ is not allowed to vary and in (17) $u_1$ and $u_2$ are not allowed to
vary. In particular if we let $U_k'$ approach $u_k'$ in (16) we obtain a condition
analogous to the Legendre condition.

8. The analogue of the Jacobi condition. We proceed now to determine
an analogue of the Jacobi condition for the problem of simultaneous maxima.
Let $E_{02}$ and $E_{03}$ be two extremals of a two-parameter family, $u_3 = \text{constant}$,
$u_4(x_0) = u_{40}$, through the point 0, and suppose that these extremals touch an
envelope $N$ of the family at their end points 2 and 3. Since the differential
$dI_1$ of §6 is a total differential, $u_3$ being constant, the integral $I_1^* = \int dI_1$
around a closed contour $C$ is zero. As already pointed out in §6, $I_1^*$ along an
extremal is identically equal to $I_1$, hence


The differentials $dx$, $du_i$, $i = 1, \dots, 4$, at a point of the envelope satisfy
the equations $du_i = p_i\,dx$ with the slope $p_i$ of the extremal tangent to $N$ at
that point. It follows then that $I_1^*(N_{23})$ is the same as $I_1(N_{23})$; hence the
following theorem:


THEOREM 7. If $E_{02}$ and $E_{03}$ are two members of a two-parameter family,
$u_3 = \text{constant}$, of extremals through the fixed point 0, and if these touch an envelope
$N$ of the family at their end points 2 and 3, then the values of the integral $I_1$
along the arcs $E_{02}$, $E_{03}$ and $N_{23}$ satisfy the equation $I_1(E_{02}) + I_1(N_{23}) = I_1(E_{03})$
for every position of the point 3 preceding 2 on $N$.


This theorem is the analogue of the envelope theorem of the calculus of
variations.§ We have also a similar theorem for the integral $I_2$, for $u_1$ and $u_2$
held constant.


† For definition of definite negative forms see M. Bôcher, Introduction to Higher Algebra, p. 150.
‡ Hadamard, loc. cit., p. 391.
§ Bliss, loc. cit., pp. 140-141.




Consider now the value of J/ (0) =6/,(0) given by equation (10B) for 
all end values fixed. Let ui,---, us, ud, x) and ut,---, 
tus, ud , x) be, respectively, the coefficients of 6f, and 5f/ in this expression. 
In order to simplify the notation further let 6f,=&. We can then write 
(10B) as 


I{ (0) = f ‘Dake + 


where & is umbral and ranges over 1 and 2 only. By a differentiation of this 
expression we obtain 


I{"(0) = f + Ex + 


+ /dus)EL + EL bul |dx, 


since £3 is zero by hypothesis. The variations 5u, and 5u{ are to be determined 
by the equations of partial variation of §5. We could carry out the 
substitution and application of Dirichlet’s formula as before, and obtain an 
equation analogous to Jacobi’s differential equation.* As one can readily 
see, this equation would be functional-differential in form, and would, there- 
fore, be extremely difficult to handle. Instead of attempting to derive 
Jacobi’s condition rigorously by means of this equation, we shall content 
ourselves with using a simpler and less rigorous method.†

According to the last theorem the value of $I_1$ along the composite arc
$E_{03} + N_{32} + E_{21}$ is always the same as its value along $E_{01}$. Since $N_{32}$ is not an
extremal, it can be replaced by an arc $C_{32}$ giving $I_1$ a larger value, and hence
$I_1(E_{01})$ cannot be a maximum.‡ As a further necessary condition we must,
therefore, demand that there be no point 2 conjugate to 0 between 0 and 1
on a maximizing arc $E_{01}$ which is an extremal, with a condition analogous to
the condition $\partial^2 F_1/\partial y'\,\partial y' \ne 0$ everywhere on it.


IV. SUFFICIENT CONDITIONS FOR SIMULTANEOUS MAXIMA 


9. Relative strong and weak maxima. By definition an extremal curve
$E_{01}$ furnishes a strong relative maximum for an integral $I_1$ when $u_3$ is not
allowed to vary, if there exists a positive number $\epsilon$ such that the integral


* Bliss, loc. cit., p. 163. 

† Bliss, loc. cit., p. 141.

‡ To prove this we need to know that the functional-differential equations (9) defining an
extremal have a unique solution at an arbitrarily selected point and direction.




| 
I, = f Fy (uy, ,x)dx 
z 


is greater than the integral 


I(w) = f + wi(x), wf + wi (x), + wa(x), ud + wi (x), x]dx 


for all possible forms of the functions $w_i$, $i = 1, \dots, 4$, of class (I) in the
interval $x_0 \le x \le x_1$, and satisfying the conditions*


(18) we(xo) = 0; | mS 


When in addition to the conditions above the functions w,(x) satisfy 
| (x) |<e for an extremal curve we, us, us, x) furnishes a 
weak relative maximum.t} 

10. Sufficient conditions for a maximum. By means of the definition of 
an extremal field of §6 and the definitions of strong and weak relative maxima 
of §9 we are now in a position to write sufficient conditions for both strong 
and weak relative maxima. 


THEOREM 8. If Eo: is an extremal arc and if the conditions (14) and (15) 
without the equality sign are satisfied at every element (tu, ui ,--- , Us, Us , x) 
in a neighborhood R', contained in R, of the corresponding elements of Eu 
for every admissible set (u;, Ui, --- , us, Ud , x) such that in (14) the expressions 
(Ui and (Ud are not both zero and yet Uj =0, and, further- 
more, in (15) (Us ) is not zero and yet U{ =0 and UZ =0, ard 
if, finally, there is no point 2 conjugate to 0 between 0 and 1 on $E_{01}$, then $I_1(E_{01})$
is a strong relative maximum when $u_3$ is not allowed to vary, and $I_2(E_{01})$ is a
strong relative maximum when $u_1$ and $u_2$ are not allowed to vary.


The conditions for a weak relative maximum do not require that the 
Weierstrass conditions be satisfied. 


THEOREM 9. If $E_{01}$ is an extremal arc and if the Legendre conditions (16)
and (17) without the equality sign are satisfied at every set of values
$(u_1, u_1', \dots, u_4, u_4', x)$ on this arc, and if there is no point 2 conjugate to 0 between
0 and 1 on $E_{01}$, then $I_1(E_{01})$ is a weak relative maximum when $u_3$ is not allowed
to vary, and $I_2(E_{01})$ is a weak relative maximum when $u_1$ and $u_2$ are not allowed to
vary.


* If $w(x)$ is a continuous function admitting a continuous derivative in $x_0 \le x \le x_1$, we shall
say that it belongs to the class (I) in the interval $(x_0, x_1)$. See E. Goursat, Cours d’Analyse
Mathématique, vol. 3, p. 547.

† These are the classical definitions given by Goursat, loc. cit., pp. 612-613.




Although the above sufficient conditions apply strictly to the generalized
Lagrange problem, by a slight modification they can be made to apply to
the classical problem where only one integral $I_1$ is considered. Thus, allowing
$u_3$ to vary in $F_1$ requires that the subscript $k$ in (14) and (15) take on the
values 1, 2, 3 instead of 1 and 2 only. The arguments of $F_1$ and $G$ in all of
the relations must of course be changed so that $U_3'$ is given a proper place.
Inasmuch as this change will be obvious to the careful reader no attempt
to write the corresponding conditions for the classical problem will be made
here.


The sufficiency theorems for the general problem when one end point is 
variable differ from those just given in that the transversality conditions must be 
adjoined.* 


V. INTEGRAL EQUATION TREATMENT OF THE PROBLEM OF LAGRANGE FOR 
MORE THAN ONE DIFFERENTIAL EQUATION 


11. Equations of variation. It is readily seen that the analysis of the 
preceding chapter applies to the problem of Lagrange for one differential 
equation. By introducing the theory of Volterra integral equations† this
analysis can be modified to apply to the Lagrange problem for more than one 
differential relation and to the more general problem for which the differential 
relation is replaced by a functional relation of the type referred to in the 
introduction. Since the method employed in solving the problem for two 
differential equations is perfectly general, we need only discuss this case. 

Our problem is to determine, through two fixed points 0 and 1 in the
hyperspace $(u_1, u_2, u_3, x)$, a curve $E_{01}$ which satisfies two differential equations
$G_k(u_1, u_1', u_2, u_2', u_3, u_3', x) = 0$, $k = 1, 2$, and which furnishes a maximum
for an integral

\[ I = \int_{x_0}^{x_1} F(u_1, u_1', u_2, u_2', u_3, u_3', x)\,dx. \]


We assume the G; to be functionally independent, i.e. 


9G2/dug 
in the region under consideration, and to possess continuous second-order 
partial derivatives with respect to Ud, Us, Us , U4, Us, x. Although 


* Bliss, loc. cit., pp. 169-170. 
† For the theory of Volterra integral equations see V. Volterra, Leçons sur les Equations Intégrales.




we have chosen both ends fixed, it is not necessary to make this assumption, 
as will appear presently. Let us first consider the problem stated above for 
both end points fixed. 

Let the maximizing curve $E_{01}$, if such a curve exist, be the one defined by
the equations

\[ u_p = z_p(x) \quad (p = 1, 2), \qquad u_3 = y(x), \]

and write

\[ u_3 = y + \theta(x, a); \qquad u_p = z_p + f_p(x, a) \quad (p = 1, 2), \]

where $\theta$ and $f_p$ are functions continuous with their second derivatives with
respect to $x$ and $a$, and which vanish when $a$ vanishes. This notation for the
$u_i$, $i = 1, 2, 3$, is used to indicate that $u_3$ is to be regarded as the function whose
variation is independent. We assume the variations of $u_1$ and $u_2$ to be determined
by the end values $u_1(x_0) = u_{10}$, $u_2(x_0) = u_{20}$ and the differential
equations of total variation


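\[ \frac{\partial G_k}{\partial y}\,\delta\theta + \frac{\partial G_k}{\partial y'}\,\delta\theta' + \sum_{p=1}^{2}\left(\frac{\partial G_k}{\partial z_p}\,\delta f_p + \frac{\partial G_k}{\partial z_p'}\,\delta f_p'\right) = 0 \qquad (k = 1, 2) \tag{19} \]

obtained by substituting the values of $u_i$ given above in the differential
equations $G_k = 0$, differentiating with respect to $a$ and then setting $a = 0$. As
we showed in §3 this is consistent with the assumption that 0 is a fixed end
point.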

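If these same values of $u_i$ be substituted in $F$, the integral $I$ becomes a
function of the parameter $a$ and yields on differentiation with respect to
this parameter

\[ \delta I = \frac{dI}{da}\,\delta a = \int_{x_0}^{x_1} \left[\frac{\partial F}{\partial y}\,\delta\theta + \frac{\partial F}{\partial y'}\,\delta\theta' + \frac{\partial F}{\partial z_r}\,\delta f_r + \frac{\partial F}{\partial z_r'}\,\delta f_r'\right] dx, \]

where $r$ is an umbral index with range 1, 2 corresponding to $p$ with range 1, 2.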

12. Dependent variations by the theory of Volterra integral equations. 
In the classical treatment of this problem, Lagrange multipliers are intro- 
duced at this stage, but they can be advantageously avoided by integrating 
equation (19) with respect to $x$ between the limits $x_0$ and $x$, where $x_0 \le x \le x_1$.
If we perform this integration, replace x under the integral sign by s, and 
then perform an integration by parts on the terms which involve the primed 
variations, we obtain 


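\[ \frac{\partial G_k}{\partial y'}\,\delta\theta + \frac{\partial G_k}{\partial z_r'}\,\delta f_r + \int_{x_0}^{x} \left[\frac{\partial G_k}{\partial y} - \frac{d}{ds}\frac{\partial G_k}{\partial y'}\right]\delta\theta\,ds + \int_{x_0}^{x} \left[\frac{\partial G_k}{\partial z_r} - \frac{d}{ds}\frac{\partial G_k}{\partial z_r'}\right]\delta f_r\,ds = 0, \tag{20} \]

for, since $\delta\theta$ vanishes at $x_0$, the $\delta f_r$ must also vanish there if the determinant $\Delta G$
is not zero in $x_0 \le x \le x_1$. We take $r$ umbral as before.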

The variations 6f, are then determined by the system of Volterra integral 
equations 


\[ \delta f_r(x) = \phi_r(x) + \int_{x_0}^{x} K_{rp}(x, s)\,\delta f_p(s)\,ds \qquad (r = 1, 2), \tag{21} \]


where by definition 


d 


d 


where / and # are umbral indices having ranges 1, 2, and A;, is the cofactor 
of the corresponding element of AG divided by —AG. 

These integral equations form a Volterra system of the second type for
the determination of the $\delta f_r$, uniquely, if the kernels $K_{rp}(x, s)$ are finite and
integrable in the interval $x_0 \le s \le x \le x_1$.*

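If $\Delta G \ne 0$ on the range $x_0 \le s \le x \le x_1$, the $K_{rp}(x, s)$ will be finite and integrable
on the range because of the continuity requirements on the $G_k$.
The unique solution of the system is, therefore,

\[ \delta f_r(x) = \phi_r(x) + \int_{x_0}^{x} S_{rp}(x, s)\,\phi_p(s)\,ds, \tag{22} \]

where $p$ is umbral with range 1, 2 and $S_{rp}(x, s)$ is the resolvent kernel of
$K_{rp}(x, s)$ defined by the equations

\[ K_{rp}^{(1)}(x, s) = K_{rp}(x, s), \qquad K_{rp}^{(i)}(x, s) = \int_{s}^{x} K_{rl}(x, t)\,K_{lp}^{(i-1)}(t, s)\,dt, \qquad S_{rp}(x, s) = \sum_{i=1}^{\infty} K_{rp}^{(i)}(x, s). \]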


* Volterra, loc. cit., p. 71. 
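The resolvent series for a Volterra equation of the second type amounts to successive substitution. As a minimal numerical sketch (an illustration only, not part of the original analysis), the following Python code solves a scalar equation $f(x) = \phi(x) + \int_{x_0}^{x} K(x, s) f(s)\,ds$ by iterating the substitution on a uniform grid with the trapezoidal rule. The name `solve_volterra`, the grid size, the number of iterations and the test case $K \equiv 1$, $\phi \equiv 1$ (whose exact solution is $e^x$) are all assumptions made for the illustration; the system treated in the text would require the same loop with a two-rowed matrix kernel.

```python
import math

def solve_volterra(phi, K, x0, x1, n=100, iters=30):
    """Approximate f(x) = phi(x) + integral_{x0}^{x} K(x, s) f(s) ds on a grid.

    Successive substitution: each pass replaces f by phi + (integral of K*f),
    which corresponds to adding one more iterated kernel of the resolvent series.
    """
    h = (x1 - x0) / n
    xs = [x0 + j * h for j in range(n + 1)]
    f = [phi(x) for x in xs]                      # zeroth approximation
    for _ in range(iters):
        new = []
        for j, x in enumerate(xs):
            acc = 0.0
            for k in range(j):                    # trapezoidal rule on [x0, xs[j]]
                acc += 0.5 * h * (K(x, xs[k]) * f[k] + K(x, xs[k + 1]) * f[k + 1])
            new.append(phi(x) + acc)
        f = new
    return xs, f

# Hypothetical test case: f(x) = 1 + integral_0^x f(s) ds, whose solution is exp(x).
xs, f = solve_volterra(lambda x: 1.0, lambda x, s: 1.0, 0.0, 1.0)
print(f[-1], math.e)
```

The two printed numbers should agree to within the discretization error of the trapezoidal rule.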




If we substitute in (22) the values of $\phi_r(x)$ and $\phi_p(s)$ as given by their definitions,
then apply Dirichlet’s formula to the iterated integral of the result,
and then interchange the parameters of integration, we may write the variation
$\delta f_r$ as

\[ \delta f_r(x) = W_r(x)\,\delta\theta + \int_{x_0}^{x} V_r(x, s)\,\delta\theta\,ds, \tag{23} \]
where by definition : 

= 


d 
d z 
ds 2 


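and where $l$ and $p$ are umbral indices with range 1, 2.
By differentiation with respect to $x$ we obtain

\[ \delta f_r'(x) = \frac{d}{dx}\bigl(W_r\,\delta\theta\bigr) + V_r(x, x)\,\delta\theta + \int_{x_0}^{x} \frac{\partial V_r(x, s)}{\partial x}\,\delta\theta\,ds. \tag{24} \]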


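13. Eulerian equations. A substitution of $\delta f_r(x)$ and $\delta f_r'(x)$ in the first
variation of $I$ followed by an application of Dirichlet’s formula as before
yields

\[ \delta I = \int_{x_0}^{x_1} \left[\frac{\partial F}{\partial y} + \frac{\partial F}{\partial z_r}\,W_r + V_r(x, x)\,\frac{\partial F}{\partial z_r'} + T(x)\right]\delta\theta\,dx + \int_{x_0}^{x_1} \left[\frac{\partial F}{\partial y'}\,\delta\theta' + \frac{\partial F}{\partial z_r'}\,\frac{d(W_r\,\delta\theta)}{dx}\right] dx, \]

where

\[ T(x) = \int_{x}^{x_1} \left[\frac{\partial F}{\partial z_r}\,V_r(s, x) + \frac{\partial F}{\partial z_r'}\,\frac{\partial V_r(s, x)}{\partial s}\right] ds. \]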


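An integration by parts performed on the primed terms yields

\[ \delta I = \int_{x_0}^{x_1} \left[\frac{\partial F}{\partial y} + \frac{\partial F}{\partial z_r}\,W_r + V_r(x, x)\,\frac{\partial F}{\partial z_r'} + T(x) - \frac{d}{dx}\frac{\partial F}{\partial y'} - W_r\,\frac{d}{dx}\frac{\partial F}{\partial z_r'}\right]\delta\theta\,dx. \]

In order that $\delta I$ vanish for all $\delta\theta$ it is necessary that the coefficient of $\delta\theta$
in the above integral vanish and hence

\[ \frac{\partial F}{\partial y} + \frac{\partial F}{\partial z_r}\,W_r + V_r(x, x)\,\frac{\partial F}{\partial z_r'} + T(x) - \frac{d}{dx}\frac{\partial F}{\partial y'} - W_r\,\frac{d}{dx}\frac{\partial F}{\partial z_r'} = 0, \tag{25} \]

where $r$ is an umbral index with range 1, 2.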

It is well to note that the function T(x) is an integral involving the re- 
solvent kernel of the system of Volterra integral equations defining the 
variations. If some variable other than u; had been chosen to be independent, 
a different set of conditions of the type (25) would result, but presumably 
the new set would be equivalent to (25). 

If one or both end points were variable, the problem could still be
treated by the methods of this paragraph. For this case equation (20) would
contain terms in $\delta\theta(x_0)$, $\delta\theta(x_1)$, $\delta f_r(x_0)$, $\delta f_r(x_1)$. These terms could be carried
all the way through the analysis and would yield transversality conditions
in a new form. This interesting problem will not be attacked in the present
paper.


VI. FURTHER GENERALIZATIONS


14. Problem for functional relations. A special problem in which a
linear integral equation replaces the first-order differential equation $G = 0$
has already been considered.* We desire now to consider the more general
problem of determining a curve $E_{01}$ of the space $(u_1, u_2, u_3, x)$ satisfying functional
relations


(26) 


z 
= f Py(u1, Ui ,U2, Uz ,x,S)ds 


such that an integral

\[ I = \int_{x_0}^{x_1} F(u_1, u_1', u_2, u_2', u_3, u_3', x)\,dx \]

is a maximum. We may suppose the end parameter $x_0$ and the corresponding
end values $u_i(x_0)$, $i = 1, 2, 3$, to be fixed, although this is not necessary.
Let us, for the sake of brevity, also suppose $x_1$ and the corresponding end
values of the $u_i$ to be fixed.

If the u; are replaced by functions satisfying the same conditions as the 
corresponding functions of §11, the functional equations (26) become relations 
involving the parameter a and yield by parametric differentiation 


* A mathematical theory of competition, loc. cit., p. 173. 


(k=1,2), 


(8G,/ay)80 + + Jaf! 
f [(aP./ay)80 + 


+ + |ds (k = 1,2), 


where r is umbral with range 1, 2. 
An integration with respect to x followed by an integration by parts on 
the primed variations yields 


Ss 
d 


z 8 d 
+( (aP,/ae! ))ar Ja, 


where for convenience in notation the parameter of integration x has been 
changed to s. 

If we apply Dirichlet’s formula to the iterated integral, we can write this
expression as


(8G,/a2! = — — f 


z 


d 
AY 


— s)(aPs/ay 50 ds 


d 
Zo 


d 


As far as the variations $\delta f_r$ and $\delta\theta$ are concerned this expression is of
the same form as (21), if the determinant $|\partial G_k/\partial z_r'|$ is not zero on the range
$x_0 \le s \le x \le x_1$. The analysis of the preceding section, therefore, applies from this point.




15. Further extensions. The extension to the case of more than one in- 
dependent variable is obtained by placing a subscript on y in the above 
equations and regarding this subscript as an umbral index of the proper range. 

The problem of simultaneous maxima for more than one differential or 
integral relation can be treated by this same method, since, if there are two 
integrals and two inde endent variables us and #,, equation (25), with the 
proper arguments for Fi, F, and G, substituted, is a necessary condition 
that a curve Ey; in the space (w#, we, us, M4, x) satisfy differential equations 
ul,- ++, Us, ud, x) =0, R=1, 2, and make an integral 


a maximum when % is not allowed to vary. 

A discussion of the corresponding problems for variable end points should 
lead to analogues of the Weierstrass and Legendre necessary conditions and 
to sufficient conditions for strong and weak relative maxima. 

The assumption that the admissible arcs have continuously turning tangents
is by no means necessary. If we make assumptions on admissible arcs
similar to those of Part II, we can obtain the analogues of the more general
Euler equations (7) and (8), by integrating by parts the terms in $\delta\theta$ instead of
those in $\delta\theta'$. The analogues of the Weierstrass-Erdmann corner conditions*
follow readily from the Euler equations in the form (7) and (8).


* Bliss, loc. cit., p. 143, and O. Bolza, Vorlesungen über Variationsrechnung, 1909, p. 366.


RICE INSTITUTE,
Houston, TEXAS 


ON JACOBI’S ARITHMETICAL THEOREMS CONCERNING 
THE SIMULTANEOUS REPRESENTATION OF 
NUMBERS BY TWO DIFFERENT 
QUADRATIC FORMS* 


BY 
J. V. USPENSKY 


Part I 


The infinite series and products afforded by the theory of elliptic functions 
lead in a natural and easy way to the discovery of many peculiar arith- 
metical theorems which, at first sight, do not seem to be easily obtainable by 
purely arithmetical methods. But deeper insight into arithmetical properties 
of elliptic series and products shows that the use of them may be superseded 
by certain arithmetical relations of a very general nature. Liouville was the 
first to call attention to this fact, but he never attempted to give a complete 
and systematic account of his ideas, and for this reason they did not attract 
the attention they deserve. The author of this paper, by his personal investi- 
gations concerning this subject, has been led to the conclusion that all the 
results previously obtained by means of elliptic functions may be as well 
established by purely arithmetical methods of extremely elementary nature, 
and he published his investigations, so far as it was possible under the 
circumstances, in a series of papers.† It is his intention to show in this
paper how the arithmetical theorems obtained by Jacobi in his memoir
Ueber diejenigen unendlichen Reihen, deren Exponenten zugleich in zwei
verschiedenen quadratischen Formen enthalten sind‡ can be easily derived by
very simple arithmetical considerations. The paper is divided into two parts, 
according to two different methods of treatment for questions of this kind. 
The method developed in the first part is based on Liouville’s ideas, but, 


* Presented to the Society, December 31, 1926; received by the editors January 12, 1927.

† J. Uspensky, On arithmetical theorems given by Stieltjes (in Russian), Bulletin of the Mathematical
Society in Kharkov, vol. 13 (1912); On the representation of numbers by sums of squares (in
Russian), ibid., vol. 13 (1912); On certain arithmetic theorems (in Russian), ibid., vol. 13 (1913);
On the possibility of representation of primes by certain quadratic forms (in Russian), Bulletin of the
Mathematical Society in Kazan, 1915; On the representation of numbers by the quadratic forms with
4 and 6 variables (in Russian), Bulletin of the Mathematical Society in Kharkov, vol. 16 (1916);
Sur les relations entre les nombres des classes des formes quadratiques binaires et positives, Parts 1, 2, 3,
4, 5, Bulletin of the Academy of Sciences of the U. S. S. R., 1925-26; Note sur le nombre de représentations
des nombres par une somme d’un nombre pair de carrés, ibid., 1925.

‡ Jacobi’s Gesammelte Werke, vol. 3, p. 220.




386 J. V. USPENSKY [April 


though simple and elegant, it is not the best adapted to this subject. Such 
a method, meeting all the requirements of simplicity and efficacy, will be 
developed in the second part. 

1. The fundamental identity and its consequences. Instead of using var- 
ious properties and transformations of elliptic series and products, we shall 
take for our starting point a single very general formula which, in a certain 
sense, contains the very essence of the arithmetical properties of elliptic 
functions. Let F(x, y, z) represent a function defined for integral values 
of the variables x, y, z and satisfying the conditions 


\[ F(-x, y, z) = -F(x, y, z), \qquad F(x, -y, -z) = F(x, y, z), \qquad F(0, y, z) = 0. \tag{1} \]

For any such function and for any positive integer $n$, the following fundamental
relation holds:

\[ 2\sum F(\delta - 2i,\, d + i,\, 2d + 2i - \delta) - \sum F(d + \delta,\, i,\, d - \delta) = T, \tag{2} \]

where both sums extend over all the solutions of the indeterminate equation

\[ n = i^2 + d\delta \tag{3} \]

in integers $i$, $d$, $\delta$, the two last being supposed positive, and $T = 0$ when $n$
is not a perfect square, while

\[ T = \sum_{j=1}^{2s-1} \bigl[2F(2s - j,\, s,\, 2s - j) - F(2s,\, s - j,\, 2s - 2j)\bigr] \qquad \text{when } n = s^2,\ s > 0. \]


In order to avoid lengthy explanations, it is useful to introduce the symbolic
notation $\{A\}$ for a quantity related to an integer $n$, having value 0 except
when $n$ is a perfect square: $n = s^2$, $s > 0$, in which case the above symbol
represents the number $A$. With this notation adopted, the fundamental
equation (2) may be written as follows:

\[ \sum_{(a)} \bigl[2F(\delta - 2i,\, d + i,\, 2d + 2i - \delta) - F(d + \delta,\, i,\, d - \delta)\bigr] = \Bigl\{ \sum_{j=1}^{2s-1} \bigl[2F(2s - j,\, s,\, 2s - j) - F(2s,\, s - j,\, 2s - 2j)\bigr] \Bigr\}, \qquad n = s^2,\ s > 0, \tag{2 bis} \]

(a) $n = i^2 + d\delta$.*

The proof of this important equation is very easy, and the reader can find
it for instance in the first part of our memoir Sur les relations entre les nombres
des classes des formes quadratiques binaires et positives.†


* In this formula (a) $n = i^2 + d\delta$ indicates the range of summation for $\sum_{(a)}$. A similar notation
is used throughout.
† Loc. cit.
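As an informal numerical check of the fundamental relation (it proves nothing, and is not part of the original memoir), the following Python sketch enumerates the solutions of $n = i^2 + d\delta$ and compares the two members of (2) for small $n$. The test function `F` is a hypothetical choice made only because it satisfies the conditions (1); the helper names `partitions`, `left_side` and `T` are likewise assumptions of the sketch.

```python
from math import isqrt

def partitions(n):
    """Yield every (i, d, delta) with n = i*i + d*delta, d > 0, delta > 0."""
    r = isqrt(n)
    for i in range(-r, r + 1):
        rest = n - i * i
        for d in range(1, rest + 1):
            if rest % d == 0:
                yield i, d, rest // d

def F(x, y, z):
    # Hypothetical test function: odd in x, unchanged by (y, z) -> (-y, -z), zero at x = 0.
    return x**3 * (1 + y * z + 2 * y * y) + x * (z * z - 3 * y * z)

def left_side(n):
    """Left member of (2): twice the first sum minus the second sum."""
    return sum(2 * F(dl - 2 * i, d + i, 2 * d + 2 * i - dl) - F(d + dl, i, d - dl)
               for i, d, dl in partitions(n))

def T(n):
    """Right member of (2): zero unless n = s*s with s > 0."""
    s = isqrt(n)
    if s * s != n:
        return 0
    return sum(2 * F(2 * s - j, s, 2 * s - j) - F(2 * s, s - j, 2 * s - 2 * j)
               for j in range(1, 2 * s))

# Values of n below 61 for which the two members differ (none are expected).
mismatches = [n for n in range(1, 61) if left_side(n) != T(n)]
print(mismatches)
```

Any other function with the symmetries (1) may be substituted for `F` in the same way.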


1928] ARITHMETICAL THEOREMS OF JACOBI 387 


From the fundamental equation (2), by submitting the highly arbitrary 
function F(x, y, z) to certain restrictions, many other important relations 
may be derived. In this paper, however, we confine ourselves only to 
such of these relations as are absolutely necessary for our purpose. Denoting 
by f(x) an arbitrary odd function of a single variable x, we can first take, in 
the fundamental equation (2), 


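\[ F(x, y, z) = 0 \ \text{for even } x, \qquad F(x, y, z) = f(x) \ \text{for odd } x, \]

which, after a simple discussion, leads to the following relation:

\[ \sum_{(a)} f(2d + \delta) = \sum_{(b)} f(\delta - 2i) - \Bigl\{ \sum_{(c)} f(j) \Bigr\}, \tag{4} \]

(a) $n = i^2 + 2d\delta$, $\delta$ odd; (b) $n = i^2 + d\delta$, $\delta$ odd;
(c) $j = 1, 3, 5, \dots, 2s - 1$, $n = s^2$, $s > 0$.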


Again, assuming in the same fundamental equation 
F(x, y, z) = 0 for even x,
F(x, 2) (— 1) (s+#)/2+ f(y) for odd x, 


we arrive at the very useful formula

\[ \sum_{(a)} (-1)^i f(d + i) = \bigl\{ (-1)^{s-1}\, s f(s) \bigr\}, \qquad n = s^2,\ s > 0, \tag{5} \]

(a) $n = i^2 + d\delta$, $\delta$ odd.
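The relations (4) and (5) can be spot-checked in the same informal way. The sketch below is only an illustration under stated assumptions: the odd function `f` is taken to be a cube, the braces $\{A\}$ are modelled by the helper `braces`, and the sign in (5) is read as $(-1)^{s-1}$.

```python
from math import isqrt

def partitions(n):
    """Yield every (i, d, delta) with n = i*i + d*delta, d > 0, delta > 0."""
    r = isqrt(n)
    for i in range(-r, r + 1):
        rest = n - i * i
        for d in range(1, rest + 1):
            if rest % d == 0:
                yield i, d, rest // d

def f(x):
    return x ** 3  # hypothetical choice; any odd function may be substituted

def braces(n, value):
    """The symbol {A}: value(s) when n = s*s with s > 0, and 0 otherwise."""
    s = isqrt(n)
    return value(s) if s > 0 and s * s == n else 0

def sign(k):
    return 1 if k % 2 == 0 else -1  # (-1)**k, kept integral for negative k

def relation_4_holds(n):
    # Range (a): n = i^2 + 2*d'*delta with delta odd; writing d = 2*d', the term is f(d + delta).
    lhs = sum(f(d + dl) for i, d, dl in partitions(n) if d % 2 == 0 and dl % 2 == 1)
    rhs = sum(f(dl - 2 * i) for i, d, dl in partitions(n) if dl % 2 == 1)
    rhs -= braces(n, lambda s: sum(f(j) for j in range(1, 2 * s, 2)))
    return lhs == rhs

def relation_5_holds(n):
    lhs = sum(sign(i) * f(d + i) for i, d, dl in partitions(n) if dl % 2 == 1)
    rhs = braces(n, lambda s: sign(s - 1) * s * f(s))
    return lhs == rhs

print(all(relation_4_holds(n) and relation_5_holds(n) for n in range(1, 80)))
```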


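Suppose now $n$ divisible by 4, so that $n = 4m$. By taking first for $F(x, y, z)$
the function defined as follows:

\[ F(x, y, z) = 0, \ \text{whenever } x \text{ is odd}, \]
\[ F(x, y, z) = 0, \ \text{whenever } y \text{ is even}, \]
\[ F(x, y, z) = \phi(x/4,\, y), \ \text{otherwise}, \]

where $\phi(x, y)$ denotes an arbitrary function odd with respect to $x$ and even
with respect to $y$, we have after a simple discussion and with a slightly
changed notation

\[ \sum_{(a)} \phi\bigl[(d + \delta)/4,\, h\bigr] = 2\sum_{(b)} \phi(d + i,\, \delta - 2i) + \Bigl\{ 2\sum_{(c)} \phi(s, j) \Bigr\}, \tag{6} \]

(a) $4m = h^2 + d\delta$, $h$ odd; (b) $m = i^2 + d\delta$, $\delta$ odd;
(c) $j = 1, 3, 5, \dots, 2s - 1$, $m = s^2$, $s > 0$.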
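Taking again

\[ F(x, y, z) = 0, \ \text{whenever } x \text{ is odd}, \]
\[ F(x, y, z) = 0, \ \text{whenever } y \text{ is even}, \]
\[ F(x, y, z) = (-1)^{(y-1)/2 + (z-2)/4}\, \phi(x/4,\, y), \ \text{otherwise}, \]

we get the equation

\[ \sum_{(a)} (-1)^{(h-1)/2 + (d-\delta-2)/4}\, \phi\bigl[(d + \delta)/4,\, h\bigr] = 2\sum_{(b)} (-1)^d \phi(d + i,\, \delta - 2i) + \Bigl\{ 2\sum_{(c)} \phi(s, j) \Bigr\}, \tag{7} \]

(a) $4m = h^2 + d\delta$, $h$ odd; (b) $m = i^2 + d\delta$, $\delta$ odd;
(c) $j = 1, 3, 5, \dots, 2s - 1$, $m = s^2$, $s > 0$,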


the left member of which is obviously equal to 0. Subtracting now (7)
from (6) we have finally

\[ \sum_{(a)} \phi\bigl[(d + \delta)/4,\, h\bigr] = 4\sum_{(b)} \phi(d + i,\, \delta - 2i), \tag{8} \]

(a) $4m = h^2 + d\delta$, $h$ odd; (b) $m = i^2 + d\delta$, $d$ and $\delta$ odd.
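A small numerical experiment may make relation (8) more concrete. The following sketch is an illustration under assumptions, not the author's method: the function `phi` is a hypothetical example that is odd in its first argument and even in its second, and the ranges of summation are read as (a) $4m = h^2 + d\delta$ with $h$ odd and (b) $m = i^2 + d\delta$ with $d$ and $\delta$ odd.

```python
from math import isqrt

def partitions(n):
    """Yield every (i, d, delta) with n = i*i + d*delta, d > 0, delta > 0."""
    r = isqrt(n)
    for i in range(-r, r + 1):
        rest = n - i * i
        for d in range(1, rest + 1):
            if rest % d == 0:
                yield i, d, rest // d

def phi(x, y):
    return x * (1 + y * y)  # hypothetical choice: odd in x, even in y

def left_8(m):
    # With h odd, d*delta = 4m - h^2 is 3 (mod 4), so d + delta is divisible by 4.
    return sum(phi((d + dl) // 4, h) for h, d, dl in partitions(4 * m) if h % 2 == 1)

def right_8(m):
    return 4 * sum(phi(d + i, dl - 2 * i)
                   for i, d, dl in partitions(m) if d % 2 == 1 and dl % 2 == 1)

print(all(left_8(m) == right_8(m) for m in range(1, 50)))
```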


The last equation we need may be obtained as follows. First we suppose
$n$ divisible by 3, so that $n = 3m$, and then define $F(x, y, z)$ as follows:

\[ F(x, y, z) = 0, \ \text{whenever } y + z \text{ is not divisible by 3}, \]
\[ F(x, y, z) = 0, \ \text{whenever } y \text{ is divisible by 3}, \]
\[ F(x, y, z) = f(x/3) \ \text{otherwise}, \]

$f(x)$ being an arbitrary odd function of $x$. It is obvious that this definition
is consistent with the fundamental conditions (1). The resulting equation
with a slightly modified notation may be written as follows:

\[ \sum_{(a)} f\bigl[(\Delta + \Delta')/3\bigr] = 2\sum_{(b)} f(\delta - 2i) + \bigl\{ 4s f(2s) \bigr\}, \qquad m = 3s^2,\ s > 0, \tag{9} \]

(a) $3m = h^2 + \Delta\Delta'$, $h + \Delta - \Delta' \equiv 0 \pmod 3$; (b) $m = 3i^2 + d\delta$, $d \not\equiv 0 \pmod 3$,


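and admits of a further simplification. As $h$ is supposed to be non-divisible
by 3 in the equation $3m = h^2 + \Delta\Delta'$, it is easy to see that, whenever $h + \Delta - \Delta'$
is divisible by 3, $-h + \Delta - \Delta'$ is not divisible by 3, and, whenever $h + \Delta - \Delta'$
is not divisible by 3, $-h + \Delta - \Delta'$ is divisible by 3, whence it follows that

\[ \sum_{(a)} f\bigl[(\Delta + \Delta')/3\bigr] = \sum_{(b)} f\bigl[(\Delta + \Delta')/3\bigr], \]

(a) $3m = h^2 + \Delta\Delta'$, $h + \Delta - \Delta' \equiv 0 \pmod 3$; (b) $3m = h^2 + \Delta\Delta'$, $-h + \Delta - \Delta' \equiv 0 \pmod 3$,

or

\[ 2\sum_{(a)} f\bigl[(\Delta + \Delta')/3\bigr] = \sum_{(b)} f\bigl[(\Delta + \Delta')/3\bigr], \]

(a) $3m = h^2 + \Delta\Delta'$, $h + \Delta - \Delta' \equiv 0 \pmod 3$; (b) $3m = h^2 + \Delta\Delta'$, $h \not\equiv 0 \pmod 3$.

The left hand member of (9) being thus simplified, we finally get the important
relation

\[ \sum_{(a)} f\bigl[(\Delta + \Delta')/3\bigr] = 4\sum_{(b)} f(\delta - 2i) + \bigl\{ 8s f(2s) \bigr\}, \qquad m = 3s^2,\ s > 0, \tag{10} \]

(a) $3m = h^2 + \Delta\Delta'$, $h \not\equiv 0 \pmod 3$; (b) $m = 3i^2 + d\delta$, $d \not\equiv 0 \pmod 3$.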
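Relation (10) lends itself to the same kind of informal check. In the sketch below, which is only an illustration, the odd function `f` is again a hypothetical cube, the range (a) is read as $3m = h^2 + \Delta\Delta'$ with $h$ not divisible by 3, and the range (b) as $m = 3i^2 + d\delta$ with $d$ not divisible by 3.

```python
from math import isqrt

def partitions(n):
    """Yield every (h, D, Dp) with n = h*h + D*Dp, D > 0, Dp > 0."""
    r = isqrt(n)
    for h in range(-r, r + 1):
        rest = n - h * h
        for D in range(1, rest + 1):
            if rest % D == 0:
                yield h, D, rest // D

def f(x):
    return x ** 3  # hypothetical choice; any odd function may be substituted

def left_10(m):
    # When h is prime to 3, D*Dp is 2 (mod 3), so D + Dp is divisible by 3.
    return sum(f((D + Dp) // 3) for h, D, Dp in partitions(3 * m) if h % 3 != 0)

def right_10(m):
    total = 0
    bound = isqrt(m // 3)                       # largest |i| with 3*i*i <= m
    for i in range(-bound, bound + 1):
        rest = m - 3 * i * i
        for d in range(1, rest + 1):
            if rest % d == 0 and d % 3 != 0:
                total += f(rest // d - 2 * i)
    total *= 4
    if m % 3 == 0:
        s = isqrt(m // 3)
        if s > 0 and 3 * s * s == m:
            total += 8 * s * f(2 * s)           # the term {8 s f(2s)}, present only for m = 3 s^2
    return total

print(all(left_10(m) == right_10(m) for m in range(1, 50)))
```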


2. Number of representations by certain quadratic forms. In the following
discussion we need to use the well known results concerning the number
of representations by quadratic forms $x^2 + y^2$, $x^2 + 2y^2$ and $x^2 + 3y^2$. The
discussion would be considerably abbreviated should we simply recall these
results, but, from a methodical point of view, it is of interest to derive all
the auxiliary propositions from one and the same source.

Beginning with the quadratic form $x^2 + y^2$, let us consider the equation
$n = x^2 + y^2$, and denote by $N(n)$ the number of its solutions. The following
two equations are obvious:

\[ N(4n) = N(n), \qquad N(n) = 0 \ \text{when } n \equiv 3 \pmod 4. \]


Furthermore, denoting by $m$ any odd number, let us consider the equation

\[ 2m = x^2 + y^2 + z^2. \]

We do not assume that this equation is necessarily solvable, but, if it is,
among the numbers $x$, $y$, $z$ there are two odd and one even. The number of
solutions with an odd $x$ is, therefore, twice as great as the number of solutions
with an even $x$, which leads to the recurrent relation

\[ \sum_{(a)} N(2m - x^2) = 2\sum_{(b)} N(2m - 4x^2), \]

(a) $x = \pm 1, \pm 3, \pm 5, \dots$; (b) $x = 0, \pm 1, \pm 2, \dots$,

both series being continued so far as the arguments remain positive. The
same relation may be also written in the form

\[ \sum_{(a)} (-1)^x N(2m - x^2) + \sum_{(a)} N(2m - 4x^2) = 0, \tag{11} \]

(a) $x = 0, \pm 1, \pm 2, \dots$.

If there are no representations of $2m$ by the sum of three squares (which is
really impossible), both the sums in (11) are equal to 0, and (11) continues
to hold.

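Considering the equation $m = x^2 + y^2 + z^2$, and supposing first $m \equiv 1$
(mod 4), we easily see that the number of solutions with an even $x$ is twice
as great as the number of solutions with an odd $x$, so that

\[ \sum_{(a)} N(m - 4x^2) = 2\sum_{(b)} N(m - x^2), \]

(a) $x = 0, \pm 1, \pm 2, \dots$; (b) $x = \pm 1, \pm 3, \pm 5, \dots$,

or, what is the same thing,

\[ \sum_{(a)} N(m - x^2) = \sum_{(b)} (-1)^x N(m - x^2), \tag{12} \]

(a) $x = \pm 1, \pm 3, \pm 5, \dots$; (b) $x = 0, \pm 1, \pm 2, \dots$,

both series being continued as long as the arguments remain non-negative.
In the case $m \equiv 3 \pmod 4$ we have

\[ -\sum_{(a)} N(m - x^2) = \sum_{(b)} (-1)^x N(m - x^2), \tag{13} \]

(a) $x = \pm 1, \pm 3, \pm 5, \dots$; (b) $x = 0, \pm 1, \pm 2, \dots$,

because $N(m - 4x^2) = 0$ for every $x$. The equations (12) and (13) can be
condensed into one,

\[ (-1)^{(m-1)/2} \sum_{(a)} N(m - x^2) = \sum_{(b)} (-1)^x N(m - x^2), \tag{14} \]

(a) $x = \pm 1, \pm 3, \pm 5, \dots$; (b) $x = 0, \pm 1, \pm 2, \dots$,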
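which thus holds for every odd $m$. Now it is possible to find a priori a numerical
function satisfying the same equations (11) and (14). For this purpose
we take

\[ f(x) = (-1)^{(x-1)/2} \]

in the relation (4); the resulting equation

\[ \sum_{(a)} (-1)^{d + (\delta-1)/2} = \sum_{(b)} (-1)^{i + (\delta-1)/2} - \bigl\{ \tfrac12 \bigl(1 + (-1)^{s-1}\bigr) \bigr\}, \qquad n = s^2,\ s > 0, \]

(a) $n = i^2 + 2d\delta$, $\delta$ odd; (b) $n = i^2 + d\delta$, $\delta$ odd,

introducing the numerical function $\rho(n)$ defined by

\[ \rho(n) = \sum (-1)^{(\delta-1)/2}, \quad n = d\delta,\ \delta \text{ odd}; \qquad \rho(0) = \tfrac14, \]

leads, after a simple discussion, to the following relations:

\[ \sum_{(a)} (-1)^x \rho(2m - x^2) + \sum_{(a)} \rho(2m - 4x^2) = 0, \tag{15} \]

(a) $x = 0, \pm 1, \pm 2, \dots$;

\[ (-1)^{(m-1)/2} \sum_{(a)} \rho(m - x^2) = \sum_{(b)} (-1)^x \rho(m - x^2), \]

(a) $x = \pm 1, \pm 3, \pm 5, \dots$; (b) $x = 0, \pm 1, \pm 2, \dots$,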




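perfectly analogous to (11) and (14). Putting $w(n) = N(n) - 4\rho(n)$, we have
therefore

\[ \sum_{(a)} (-1)^x w(2m - x^2) + \sum_{(a)} w(2m - 4x^2) = 0, \]

(a) $x = 0, \pm 1, \pm 2, \dots$;

\[ \sum_{(a)} (-1)^x w(m - x^2) = (-1)^{(m-1)/2} \sum_{(b)} w(m - x^2), \]

(a) $x = 0, \pm 1, \pm 2, \dots$; (b) $x = \pm 1, \pm 3, \pm 5, \dots$,

or

\[ w(2m) + \sum_{(a)} (-1)^x w(2m - x^2) + \sum_{(b)} w(2m - 4x^2) = 0, \tag{16} \]

(a) $x = 1, 2, 3, \dots$; (b) $x = 1, 2, 3, \dots$;

\[ w(m) + 2\sum_{(a)} (-1)^x w(m - x^2) - 2(-1)^{(m-1)/2} \sum_{(b)} w(m - x^2) = 0, \tag{17} \]

(a) $x = 1, 2, 3, \dots$; (b) $x = 1, 3, 5, \dots$.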


Moreover, it follows from the definition of $\rho(n)$ that $\rho(4n) = \rho(n)$, so that
for every $n$

\[ w(4n) = w(n). \tag{18} \]

Now $w(0) = 0$ and we can establish that $w(n) = 0$ for every $n$ by means of
mathematical induction. For, supposing the equation $w(n) = 0$ established
for all $n < N$, we can prove that $w(N) = 0$. First, if $N$ is divisible by 4, we have,
by (18), $w(N) = w(N/4) = 0$. If $N = 2m$, where $m$ is supposed to be odd, it
follows from (16) that $w(N) = 0$. Finally, for an odd $N$ the same conclusion
follows from (17). The equation $w(n) = 0$ being thus established, we have
reached at the same time the well known result

\[ N(n) = 4\rho(n) \]

by means of extremely simple considerations.
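The result $N(n) = 4\rho(n)$ is easily tested by direct enumeration. The following Python sketch is an illustration only; the function names `N` and `rho` mirror the notation of the text, and $\rho(n)$ is computed as the sum of $(-1)^{(\delta-1)/2}$ over the odd divisors $\delta$ of $n$.

```python
from math import isqrt

def N(n):
    """Number of integer pairs (x, y) with x*x + y*y == n."""
    count = 0
    for x in range(-isqrt(n), isqrt(n) + 1):
        y2 = n - x * x
        y = isqrt(y2)
        if y * y == y2:
            count += 1 if y == 0 else 2   # y and -y are distinct unless y = 0
    return count

def rho(n):
    """Sum of (-1)^((delta - 1)/2) over the odd divisors delta of n, for n > 0."""
    return sum(1 if dl % 4 == 1 else -1
               for dl in range(1, n + 1, 2) if n % dl == 0)

print(all(N(n) == 4 * rho(n) for n in range(1, 300)))
```

For instance `N(25)` counts the twelve pairs $(\pm 5, 0)$, $(0, \pm 5)$, $(\pm 3, \pm 4)$, $(\pm 4, \pm 3)$, while the odd divisors 1, 5, 25 give $\rho(25) = 3$.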

The number of representations by the form $x^2 + 2y^2$ may be obtained in a
very similar way. Denoting again by $N(n)$ the number of solutions of the
equation $n = x^2 + 2y^2$, we have first the obvious equation $N(2n) = N(n)$, and
the following equation can be very easily verified:

\[ N(n) = 0 \ \text{when } n \equiv 5 \text{ or } 7 \pmod 8. \]

Consider now the equation

\[ m = 4x^2 + y^2 + 2z^2, \]


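where $m$ is supposed to be odd; the number of its solutions is given either by
the sum

\[ \sum N(m - 4x^2) \qquad (x = 0, \pm 1, \pm 2, \dots), \]

or by the sum

\[ \sum N(m - x^2) \qquad (x = \pm 1, \pm 3, \pm 5, \dots), \]

so that we have

\[ \sum_{(a)} N(m - 4x^2) = \sum_{(b)} N(m - x^2), \tag{19} \]

(a) $x = 0, \pm 1, \pm 2, \dots$; (b) $x = \pm 1, \pm 3, \pm 5, \dots$.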
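Relation (19) can be checked by brute force for small odd $m$. The sketch below is an illustration under assumptions: `N` counts the solutions of $n = x^2 + 2y^2$ (with $N(0) = 1$), and the comparison function `phi`, which sums $+1$ over the odd divisors congruent to 1 or 3 (mod 8) and $-1$ over those congruent to 5 or 7 (mod 8), is the classical expression for half the number of representations; it is introduced here as an assumption and is not taken from the text above.

```python
from math import isqrt

def N(n):
    """Number of integer pairs (x, y) with x*x + 2*y*y == n, for n >= 0."""
    count = 0
    y = 0
    while 2 * y * y <= n:
        x2 = n - 2 * y * y
        x = isqrt(x2)
        if x * x == x2:
            count += (1 if x == 0 else 2) * (1 if y == 0 else 2)
        y += 1
    return count

def left_19(m):
    # Sum of N(m - 4x^2) over x = 0, +-1, +-2, ... while the argument stays non-negative.
    return sum(N(m - 4 * x * x) * (1 if x == 0 else 2) for x in range(isqrt(m // 4) + 1))

def right_19(m):
    # Sum of N(m - x^2) over x = +-1, +-3, ... while the argument stays non-negative.
    return sum(2 * N(m - x * x) for x in range(1, isqrt(m) + 1, 2))

def phi(n):
    """Assumed classical divisor sum: +1 for odd divisors 1, 3 (mod 8), -1 for 5, 7 (mod 8)."""
    return sum(1 if dl % 8 in (1, 3) else -1
               for dl in range(1, n + 1, 2) if n % dl == 0)

print(all(left_19(m) == right_19(m) for m in range(1, 200, 2)))
print(all(N(n) == 2 * phi(n) for n in range(1, 200)))
```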


On the other hand, the function ¢(m) defined by 


¢(0) = 2» 


satisfies the equation of the same form. To show this we take f(x) = sin(πx/4)
in the relation (4). Noticing that for any odd number x


$\sin(\pi x/4) = 2^{-1/2} (-1)^{(x-1)/2 + (x^2-1)/8},$


it is easy to get the following two equations: for n ≡ 1 (mod 4),


$(-1)^{(n-1)/4} \sum_{(a)} \phi(n - x^2) = \sum_{(b)} (-1)^x \phi(n - 4x^2),$

(a) x = ±1, ±3, ±5, ...; (b) x = 0, ±1, ±2, ...;


and for n ≡ 3 (mod 4),

$(-1)^{(n-3)/4} \sum_{(a)} \phi(n - x^2) = \sum_{(b)} (-1)^x \phi(n - 4x^2),$

(a) x = ±1, ±3, ±5, ...; (b) x = 0, ±1, ±2, ....


But as φ(n) = 0 when n ≡ 5 or 7 (mod 8), it is easy to see that

$[(-1)^{(n-1)/4 + x} - 1]\,\phi(n - 4x^2) = 0$, when n ≡ 1 (mod 4),
$[(-1)^{(n-3)/4 + x} - 1]\,\phi(n - 4x^2) = 0$, when n ≡ 3 (mod 4),


whence it follows that the two preceding equations may be combined into
one:

$\sum_{(a)} \phi(n - x^2) = \sum_{(b)} \phi(n - 4x^2),$

(a) x = ±1, ±3, ±5, ...; (b) x = 0, ±1, ±2, ...,




which is perfectly analogous to (19). Putting, therefore, w(n) = N(n) - 2φ(n),
we have

(20) $w(n) + 2\sum_{(a)} w(n - 4x^2) = 2\sum_{(b)} w(n - x^2),$

(a) x = 1, 2, 3, ...; (b) x = 1, 3, 5, ...,


and obviously w(2n) = w(n). Now it is easy to show that, for every n,
w(n) = 0. For, suppose this already verified for all n < N. If N be an even
number, then w(N) = w(N/2) = 0. But if N be an odd number, the same
conclusion follows from (20). As w(0) = 0, we have w(n) = 0 for every n,
that is, N(n) = 2φ(n).

It remains to determine by means of analogous considerations
the number of representations by the form $x^2 + 3y^2$. Denoting
again by N(n) the number of solutions of the equation $n = x^2 + 3y^2$, the
following properties of N(n) are almost evident. First,


N(3n) = N(n),

N(n) = 0 when n ≡ 2 (mod 3),
and furthermore, if m denotes an odd number,

$N(2^a m) = 0$ for an odd a,

$N(2^a m) = N(4m)$ for an even a > 0.


That the equation N(4m) = 3N(m) holds true for every odd m, is not so
evident, but still can be easily proved. It implies that the number of solutions
of the equation $4m = x^2 + 3y^2$ in odd numbers x, y is twice as great as the number
of all solutions of the equation $m = x^2 + 3y^2$. Consider now the equation


$4n = \lambda^2 + \mu^2 + 3\nu^2,$


where λ and ν are supposed to be odd. The number of its solutions, on the
one hand, is given by


$\sum N(4n - \lambda^2)$ (λ = ±1, ±3, ±5, ...);


on the other hand it follows from the preceding remark that the same number
is given by the sum $2\sum N(n - x^2)$ extended over all integers x such that
$n - x^2$ is positive and odd. Thus we get the important equation

(21) $\sum N(4n - \lambda^2) = 2\sum N(n - x^2)$, $n - x^2$ odd.


We shall show now that the function χ(n) defined by


$\chi(n) = \sum \left(\frac{\delta}{3}\right)$, n = dδ, δ odd,




satisfies exactly the same equation for every m non-divisible by 3. To this 
end we take in the relation (8) 
x, = co 
3 

which leads to the following equation: 

2rd 276 27d 2rd 276 
(22) sin cos —— cos —- = 2 > sin cos cos 

(a) 3 (b) 3 3 

(a) 4n = 2? + dé, d odd; (b) n = i? + dé, d and 6 odd. 


In order to exhibit it in the simplest possible form we notice first that


$\cos(2\pi a/3) = 1$ if a ≡ 0 (mod 3),

$\cos(2\pi a/3) = -1/2$ if a ≢ 0 (mod 3),


and that for every odd number 


_ ~(=) 
sin —- = —-+{ —— }. 
3 2 x 


Profiting by these remarks we find for n=1 (mod 3) 
276 27d 1 276 


cos cos = — —» = — 


3 2 3 3 2 
so that equation (22) in this case may be written as follows:
$\sum \chi(4n - \lambda^2) = 2\sum \chi(n - x^2)$, $n - x^2$ odd.
For n=2 (mod 3) we have 
276 1 2 
cos —- cos —- = — —or+—, =O0or = + 1 (mod 3), 
3 3 2 4 
276 2ri 1 1 


cos —- cos —- = — —or+—, if i = = + 1 (mod 3), 
3 3 2 4 


and equation (22) becomes 


— — +4 — 2) = — 2 + 


(a) (c) (d) 
(a) 0; (b) A= +1; (c) i= 0; (d) 4 = + 1 (mod 3). 


But for m ≡ 2 (mod 3) it is easy to show that χ(m) = 0, whence it follows that
the preceding equation may be also written in the form


$\sum \chi(4n - \lambda^2) = 2\sum \chi(n - x^2)$, $n - x^2$ odd.


Thus, for every n non-divisible by 3 we have


$\sum \chi(4n - \lambda^2) = 2\sum \chi(n - x^2)$, $n - x^2$ odd,
and, putting w(m) = N(m) - 2χ(m) for an odd m, we reach the analogous
equation for w,
(23) $\sum w(4n - \lambda^2) = 2\sum w(n - x^2)$, $n - x^2$ odd,


provided n ≡ ±1 (mod 3). For small odd values of m we find w(m) = 0 and
to prove this in general we shall apply the process of reasoning by induction.
First we have w(m) = 0 whenever m ≡ 2 (mod 3). Now we assume as
already proved that w(m) = 0 for all m ≡ 3 (mod 4) which are < 12k+7 and
for all m ≡ 1 (mod 4) which are smaller than 4k+1. Let us take in the
equation (23) n = 3k+1 and n = 3k+2 respectively; in virtue of this supposition
all the terms in the second member disappear, and we get

w(12k + 3) + w(12k - 5) + ... = 0,

w(12k + 7) + w(12k - 1) + ... = 0.
By supposition,

w(12k - 5) = 0, w(12k - 21) = 0, ...

w(12k - 1) = 0, w(12k - 17) = 0, ...


so that
w(12k + 3) = w(4k + 1) = 0,


w(12k + 7) = 0.


Taking again in equation (23) n = 3k+4, we get
w(12k + 15) + w(12k + 7) + ... = 0,
whence
w(12k + 15) = w(4k + 5) = 0.
Moreover, w(12k+11) = 0, and consequently we have w(m) = 0 for all
m ≡ 3 (mod 4) which are < 12(k+1)+7 and for all m ≡ 1 (mod 4) which are
< 4(k+1)+1, that is, the induction is complete and the equation w(m) = 0
established. Thus we have N(m) = 2χ(m) for any odd m, while N(4m) = 6χ(m),
and in general


$N(2^a m) = 6\chi(m)$, when a is even, > 0,
$N(2^a m) = 0$, when a is odd.
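The closed forms obtained above for the forms x² + 2y² and x² + 3y² lend themselves to a quick numerical check. This is a sketch only: it assumes that φ(n) is the sum of (-1)^((δ-1)/2 + (δ²-1)/8) over the odd divisors δ of n, and that χ(n) is the sum of the Legendre symbol (δ/3) over the odd divisors δ of n. The helper names are illustrative.

```python
from math import isqrt

def count_reps(n, c):
    """Number of integer pairs (x, y), signs counted, with x*x + c*y*y == n (n >= 1)."""
    count = 0
    y = 0
    while c * y * y <= n:
        rest = n - c * y * y
        x = isqrt(rest)
        if x * x == rest:
            count += (1 if x == 0 else 2) * (1 if y == 0 else 2)
        y += 1
    return count

def odd_divisors(n):
    return [d for d in range(1, n + 1, 2) if n % d == 0]

def phi(n):
    """Assumed form: sum of (-1)^((d-1)/2 + (d*d-1)/8) over odd divisors d of n."""
    return sum((-1) ** ((d - 1) // 2 + (d * d - 1) // 8) for d in odd_divisors(n))

def chi(n):
    """Assumed form: sum of the Legendre symbol (d/3) over odd divisors d of n."""
    legendre_mod3 = {0: 0, 1: 1, 2: -1}
    return sum(legendre_mod3[d % 3] for d in odd_divisors(n))

def check(limit=300):
    """Raise AssertionError on the first disagreement; return True otherwise."""
    for n in range(1, limit):
        assert count_reps(n, 2) == 2 * phi(n)          # N(n) = 2 phi(n)
    for m in range(1, limit, 2):
        assert count_reps(m, 3) == 2 * chi(m)          # odd m
        assert count_reps(4 * m, 3) == 6 * chi(m)      # a = 2
        assert count_reps(16 * m, 3) == 6 * chi(m)     # a = 4
        assert count_reps(2 * m, 3) == 0               # a = 1
        assert count_reps(8 * m, 3) == 0               # a = 3
    return True

print(check())
```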




3. Gauss-Jacobi Theorem. Taking in equation (5) m=8+3 and 


f(x) = 
4 


ad 
1)'cos— sin — = 0, 
(a) 4 4 

(a) n = i? + dé, 5 odd, 

which is equivalent to 


(a) (b) 
(a) k=0,+1, +2,-°-°; (0) 


ρ(n) and φ(n) being taken with the same meaning as above. As $\phi(n - 16h^2)$
and $\rho[(n - i^2)/2]$ represent respectively the numbers of solutions of the
equations


n — 16h? = k? + 212, Lodd > 0, 
n — i? = 2]? + 82, lodd>0O, 


it is obvious that the preceding equation can be also written in the form 


(24) = 1), 


(a) (b) 
(a) n= k? + + (b) n = + 2)? + 873. 
Now let us consider two numerical functions G(m) and g(m) defined for
every m ≡ 1 (mod 8) by
$G(m) = \sum (-1)^y$, $m = x^2 + 16y^2$,
$g(m) = \sum (-1)^{(x^2-1)/8}$, $m = x^2 + 8y^2$,


the sums being extended over all the solutions of the corresponding equations.
Introducing these functions, the relation (24) can be written as follows:


LG(8p + 3 — 2) = Dig(8p + 3 — 1, 2,3,---); 
whence it follows necessarily that 
G(8p + 1) = g(8p + 1), 
that is, for every m ≡ 1 (mod 8) we have


$\sum_{(a)} (-1)^y = \sum_{(b)} (-1)^{(x^2-1)/8},$


(a) $m = x^2 + 16y^2$; (b) $m = x^2 + 8y^2$,


we get 




and this constitutes one of the theorems given by Jacobi. In the particular
case when m is a prime number, both of the equations


$m = a^2 + 16b^2$, $m = c^2 + 8d^2$


possess only one solution in positive integers, and it follows from the
preceding equality that whenever b is even, c is of the form 8l ± 1, and whenever
b is odd, c is of the form 8l ± 3. This particular theorem was first found by
Gauss and published in his first memoir on biquadratic residues.
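The statement attributed to Gauss is easy to test on small primes. The following sketch assumes the reading "b even exactly when c ≡ ±1 (mod 8), b odd exactly when c ≡ ±3 (mod 8)" for primes m ≡ 1 (mod 8); the helper names are hypothetical.

```python
from math import isqrt

def is_prime(m):
    """Trial division; adequate for the small range used here."""
    if m < 2:
        return False
    return all(m % p for p in range(2, isqrt(m) + 1))

def positive_solution(m, coeff):
    """Return (u, v) with u, v > 0 and m == u*u + coeff*v*v, or None if there is none."""
    v = 1
    while coeff * v * v < m:
        rest = m - coeff * v * v
        u = isqrt(rest)
        if u * u == rest:
            return u, v
        v += 1
    return None

def gauss_row(m):
    """Return (a, b, c, d) with m = a^2 + 16 b^2 = c^2 + 8 d^2 in positive integers."""
    a, b = positive_solution(m, 16)
    c, d = positive_solution(m, 8)
    return a, b, c, d

def consistent(m):
    """True when the parity of b matches the residue class of c modulo 8."""
    a, b, c, d = gauss_row(m)
    return (b % 2 == 0) == (c % 8 in (1, 7))

primes = [m for m in range(17, 3000, 8) if is_prime(m)]
all_consistent = all(consistent(m) for m in primes)
print(all_consistent)
```

For instance, gauss_row(73) gives 73 = 3² + 16·2² = 1² + 8·3², with b = 2 even and c = 1 of the form 8l + 1.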

4. Another theorem by Jacobi. Preliminary results. As the second 
example of the use of the same kind of reasoning we choose another more 
complicated theorem given by Jacobi. Certain preliminary propositions 
are, however, to be established first. Taking f(x) =sin (32x/2) in (10), the 
resulting equation may be presented as follows: 


5 
1)¢+@-1)/2 = 2 1) 


(a) (b) 
(a) 3n = kh? + 2d5,h #0 (mod 3), 6 odd; (b) = 317+ di,d #0 (mod 3), 


or else 
(25) 2 — 34) = — 


the last sum being extended over all h which are not divisible by 3 and of
the same parity as n. Suppose now that n is not divisible by 3. In this
case for h divisible by 3,


$\rho(3n - h^2) = 0,$


because the number $3n - h^2$ contains the prime number 3 only to the first
power, and therefore the summation in the right hand member of (25) may
be extended over all h of the same parity as n. Taking into account the
arithmetical meaning of the function ρ(n) the equation (25) leads to the
relations


1)* = (1/2)(— 2N(3n = h? + 2k? + 22), 


(a) 


= (1/2)(— 2N(3n = + 2k? + 22), 

(a) 
(a) nm = + + 
for even and odd values of n ≡ ±1 (mod 3) respectively. Supposing further
that n ≡ 0 (mod 4) or n ≡ 1 (mod 4) and n ≡ 1 (mod 3), the two preceding
equations can be combined into one, namely

(— 1)! 1)* = 4N(3n = x? + 2y? + 22%), 
(a) 




But it is easy to show that

$\tfrac{1}{4} N(3n = x^2 + 2y^2 + 2z^2) = N(n = x^2 + 2y^2 + 6z^2),$
and so finally we get the first preliminary result,
(26) N(m = x? + 2y? + 62%) = (— 


for every n ≡ 1 (mod 12) or n ≡ 4 (mod 12).
Now we put f(x) =sin (27x/3) in the equation (5). Denoting by o; and 
o, the two sums 


n= (- n = 912+ dd, 6 odd, 


1 2nd 
= 1)*sia——» n= i? + dé, 6 odd, 1 0 (mod 3), 


extended over all corresponding representations of m, we have 


31/2 s 
o2= n=s*, s>0. 
2 3 


By means of the known expression for the number of representations by
the form $x^2 + 3y^2$ it is easy to express σ₁ and σ₂ as follows. Denote by P, Q,
R, S, T the numbers of the solutions of the equations
$n = 9i^2 + j^2 + 3k^2$ with j + k odd,
$n = 9i^2 + j^2 + 3k^2$ with j + k even,
$n = i^2 + j^2 + 3k^2$ with j + k odd and i ≢ 0 (mod 3),
$n = i^2 + j^2 + 3k^2$ with j + k even and i ≢ 0 (mod 3),
$n = i^2 + 2j^2 + 6k^2$;


31/2 1 
a= —-(-1) =), 


31/2 1 1 
= ——(— }, 
o2 ) ( 3 ) 


and consequently 


1 1 1 
27 P—-—R —Q--S--—T 
(27) 2 


then 
3 




T = (— 1)*, = 972 + j? + 3k?, 


(a) (b) (c) 
(a) t=n,j=n; (b) t=n—1j=n; (c) i = n,j =n — 1 (mod 2) 


all the sums being extended over solutions of the equation $n = 9i^2 + j^2 + 3k^2$,
subject to the limitations indicated below each sign of the sum. For P, Q,
R and S we find easily analogous expressions
(a) (b) (c) 
(a) t=n-—1j=n; (b) t=n,j=n-—1; (c) i = n,j = n (mod 2); 


(a) (6) (c) 
(a) i=n,j=zn-—1; (b) t=n—1,j=n; (c) = n,j = n (mod 2). 


Now, substituting all these expressions into (27) we get 


4 {)ite-1 = 2{(=) n=s*,s>0, 


(a) 
(a) t=n—1j=n; (b) 4 =n,j =n — 1 (mod 2), 


or finally 


(28) Y(- pi =(- (=) n=st,s>0, 


(a) 
(a) n = 912 + 72+ 3k?; «+7 odd. 


If, supposing n ≡ 1 (mod 4), we put in (5) f(x) = sin(πx/2), the resulting
equation may be easily presented as follows:


(29) $\sum_{(a)} (-1)^h = 2\{(-1)^{(s-1)/2} s\}$, $n = s^2$, s > 0,

(a) $n = i^2 + 4h^2 + 4k^2$.
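Relation (29) can be tested by direct enumeration. The sketch below assumes the reading: for n ≡ 1 (mod 4), the sum of (-1)^h over all integer solutions (i, h, k) of n = i² + 4h² + 4k² equals 2(-1)^((s-1)/2) s when n = s² with s > 0, and 0 when n is not a square. Function names are illustrative.

```python
from math import isqrt

def signed_sum(n):
    """Sum of (-1)^h over all integer triples (i, h, k) with n == i*i + 4*h*h + 4*k*k."""
    total = 0
    bound = isqrt(n // 4)
    for h in range(-bound, bound + 1):
        sign = -1 if h % 2 else 1
        for k in range(-bound, bound + 1):
            rest = n - 4 * h * h - 4 * k * k
            if rest < 0:
                continue
            i = isqrt(rest)
            if i * i == rest:
                # i and -i are distinct solutions unless i == 0
                total += sign * (1 if i == 0 else 2)
    return total

def right_member(n):
    """2 * (-1)^((s-1)/2) * s when n is the square of s > 0, and 0 otherwise."""
    s = isqrt(n)
    if s * s != n:
        return 0
    return 2 * (-1) ** ((s - 1) // 2) * s

holds = all(signed_sum(n) == right_member(n) for n in range(1, 400, 4))
print(holds)
```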


5. Proof of Jacobi’s theorem. Denoting by m any odd number we shall 


consider two sums 
x 
3 


extended over all solutions of the equation 4m =x*+3y? in odd numbers x 
and y. All these solutions satisfying an additional condition x=y (mod 4) 


By (26), 




can be derived from the solutions of the equation $m = \xi^2 + 3\eta^2$ by the
substitution $x = \xi + 3\eta$, $y = \xi - \eta$, whence it follows that


= 22 (=)e-9= (=): 


01 = C2. 


This being established we introduce the sum 
S = =), 
3 


extended over all the solutions of the equation 4n=x?+3(y?+4z?+-4?), 
where x is supposed odd, while m represents any given number. Applying 
(29) we readily get 


x 
S= 4n = x? + 3y*, x odd, 
that is, 0 for n even and σ₂ for n odd. Since σ₂ = σ₁, we have in all cases
x 
S = X(=)s. 4n = x? + 3y?, x odd, 


the right hand member being naturally 0 for an even n. On the other hand,
we can express S as follows:


S= — 3y*) (y = 
introducing the function ¥(m) defined by 


= =) _m= x? + 12? + 
for all m=1 (mod 4). Comparing the two expressions for S we get 


— 3y%) = 


(a) (6) \3 
(6) 4n = x? + 3y%, y odd, 


whence it follows that 


y(n) = 24(=)st. n=s*,s>0, 


or 




for every n=1 (mod 4), or 


(a) = + 12y? + 122%, 
The right hand member here being the same as in (28), we have 


1)" = 2 1)**, 


(a) (b) 
(a) n = x2 + Oy? + 1223; (b) n = (6x + 1)? + 12y? + 1223, 


and as this equation is true for every m=1 (mod 12), it follows necessarily 


that 


(a) (b) 
(2) m= x? + (6) m= (6x + 1)? + 123%, 


for every n=1 (mod 12). This is Jacobi’s theorem which we had to prove. 

6. Stephen Smith’s theorem. An interesting theorem of the same kind, 
but involving an indefinite form, has been given by Stephen Smith in his 
Report on the theory of numbers. In order to derive this theorem by our 
elementary methods we need to establish a certain general relation by means 
of a certain device which may be useful on many occasions. Returning to 
the fundamental equation (2) we put 4n instead of n and choose for F(x, y, z)
either

F(x, y, z) = 0 whenever x is odd or y even,
F(x, y, z) = f(x/4, (y + z)/2) otherwise,

or

F(x, y, z) = 0 whenever x is odd or y even,
F(x, y, z) = f(x/4, (y - z)/2) otherwise,


where f(x, y) is an arbitrary function odd with respect to x and even with
respect to y. This gives


(a) 4 2 () () 
(a) 4n = d? + dé, d odd; (b) n = 4? + dd, 5 odd; 
j=1,3,5,--+,2s—1, = 5%, s>0; 


=2 + i, d) + {2 Lis, ot, 


(b) (ec) 


4 2 


(a) 4n = d? + dé, d odd; (b) n = 4? -+ dé, 5 odd 
(o) = 


d+é 


(a) 






and as the left members are obviously equal, we get 


+ i, 6 -—d — 21) = +i, d) + {sfs, A}, 
(a) (b) 
(a) n = 4? + dé, 5 odd; (0) n = i? + di, 8 odd. 


Now we take n as an odd number and suppose that f(x, y) = 0 whenever y
is odd. In this case the preceding equation becomes


+ 2h, —d— 4h) = (d+ i, d) + {sf(s, 0}, 
(a) n= yo + dé; pa n = i? + dé, ¢ and 6 odd. 
We specialize this equation further by taking 
I(x, 9) = (— 1)? "F(y/2), 
where F(y) is an arbitrary even function; after evident reductions we obtain 
(31) (2h + {(— 1)¢-sF(0)}, 
(a) 


(a) n = 4h? + dé, 


for every odd number n and even function F(y). It is remarkable, however,
that this equation holds true for an absolutely arbitrary function F(y)
if we suppose n ≡ 1 (mod 4). Let F(y) be an odd function and F(0) = 0;
n being ≡ 1 (mod 4), we evidently have


d—é6 


d—6 
(2h = 0, 


(a) 
(a) n = 4h? + dé. 


Now let F(y) be an arbitrary function. The equation (31) being satisfied for
$F_1(y) = \tfrac{1}{2}(F(y) + F(-y)),$
$F_2(y) = \tfrac{1}{2}(F(y) - F(-y)),$
it will be satisfied for their sum
$F_1(y) + F_2(y) = F(y).$


Let k denote any number satisfying the condition 4k?<n, where we suppose 


whence 




n ≡ 1 (mod 4). Taking $n - 4k^2$ instead of n in (31) and putting Φ(x - 2k, 2k),
where Φ(x, y) is an arbitrary function, instead of F(x), we get


+ 2h — 2k, 24) = {(— 2&)}, 


(a) 
n— 4k? 
(a) n — 4k? = 4h? + dé. 
Here we give to k all possible values consistent with the condition $4k^2 < n$
and add all the resulting equations; these operations performed, we arrive
at the equation


+ 2h — 2k, 24) = 2k), 


(a) (b) 
(a2) m= kh? + do; (+) n=st?+ 4k,s>0, 
which by the substitution y=h+k, z=h—k can be transformed into 
d—6 
+22, y— :) 1)¢-))/2s@(—2k, 2k), 


(a) (b) 
(a) n = 2y? + 222 + di, di = 1 (mod 4); (b) n = s?*+ 4k?,s>0. 


Now, let us define an arbitrary function Φ(x, y) as follows:
Φ(x, y) = 0 when x is different from 0,
Φ(0, y) = f(y),
where f(y) is an arbitrary function of a single variable. After a simple
discussion we arrive at the formula
1) 4+ 2) = {(— = s?, s > 0, 


(a) 


(a) n = x? -+ 2y? — 223 x + 22 > 0, 


where the sum in the left member extends over all representations of n
by the indefinite form $x^2 + 2y^2 - 2z^2$, the variables being limited by the
conditions x - 2z > 0, x + 2z > 0. Putting $f(x) = (-1)^{x/2} F(x)$, we finally get
(32) 1)-PAF(y + 2) = {(— = 5 >0, 


(a) 
(a) n = x2? -+ 2y? — 222, x + 22 > 0, 


for n ≡ 1 (mod 4) and an absolutely arbitrary function F(x). Taking here
F(x) = cos(πx/2), we have for n ≡ 1 (mod 8)


= {(— 1) n = s?, 5 >0, 
(a) 
nm = x? + 8y? — 822, x + 42> 0, 


= 

(a) 




which is equivalent to 
(33) y(- 1)"g(n — 8y?) = {(- 1) ,#=s*, s>O0, 


(a) 
(a) y¥=0,+1, +2,---, 


where g(m) represents the function defined by 
g(m) = 1) 


the sum being extended over all the representations of m by the indefinite
form $x^2 - 8z^2$, the variables being limited by the conditions x + 4z > 0,
x - 4z > 0.

On the other hand, we have, by (29), 


= 2{(— 1)¢-Y 2s}, = s*, s > 0, 
(a) 

(a) n = x? + + 823, 

that is, 


(34) >(—1)"G(m — 8y*) = {(— = s > 0, 
(a) y=0,+1,+2,---, 


where G(m) is defined by 
G(n) = = x? + 827, x > 0. 
As the left members of (33) and (34) are equal, we have 
— 8y*) = 1)¥G(m — 


for every n=1 (mod 8), whence it follows necessarily that g(m) =G(m) 
for n=1 (mod 8), or 


(a) 
a) n=x?— (b) n = x? + 83, <> 0, 
which constitutes the theorem given by Stephen Smith. 


CARLETON COLLEGE, 
NORTHFIELD, MINNESOTA 


ON RELATIVE CONTENT AND GREEN’S LEMMA* 


BY 
H. L. SMITH 


It has been shown† that if the line integral $\int_C x\,dy$ exists over a simple
closed plane curve C, then the content of K, the interior of C, exists equal
to that integral. This result may be thought of as the special case of Green's
lemma


(1) $\iint_K P_x\,dxdy = \int_C P\,dy$


in which P(xy) = x; and it is to be noted that here C does not need to be
rectifiable.

In the present paper a definition of relative content is given which makes
it possible to prove that if P and P_x are subject to certain conditions, the
content of K, relative to a certain non-additive function of rectangles
derived from P, exists equal to the double integral on the left of (1) and also
equal to the line integral on the right of (1) whenever that integral exists.
This result includes as a special case the form of Green's lemma for rectifiable
C obtained by Gross,‡ except that in our result P_x is deliberately restricted
to be properly Riemann integrable instead of summable. In the last section
sufficient conditions for the existence of the line integral are given which
yield Green's lemma for an important case in which C does not need to be
rectifiable.

1. Definitions and elementary theorems. Let 𝔖 denote a class of partitions
Π of the rectangle R₀: a ≤ x ≤ b, c ≤ y ≤ d, such that (1) each partition
Π is formed by dividing R₀ into vertical and horizontal strips; and (2) the
(greatest) lower bound of the norms of the partitions Π of 𝔖 is zero; here by
the norm of a partition Π of 𝔖 is meant the (least) upper bound of the
lengths of the diagonals of the rectangles of which Π consists.

Moreover let f(R) be a function (not necessarily single-valued) defined
for every rectangle R: x' ≤ x ≤ x'', y' ≤ y ≤ y'' lying in R₀. Also, if K₁ and K₂
are any two sets in R₀, let ε(K₁, K₂) = 1 if K₁ and K₂ have at least one point

* Presented to the Society, April 3, 1926; received by the editors December 20, 1926.

† H. L. Smith, these Transactions, vol. 27, p. 498.

‡ Wm. Gross, Monatshefte für Mathematik und Physik, vol. 26, p. 70. See also Van Vleck,
Annals of Mathematics, (2), vol. 22, p. 226; Bray, Annals of Mathematics, (2), vol. 26, p. 278.




406 H. L. SMITH [April 


in common, and let ε(K₁, K₂) = 0 if K₁ and K₂ have no point in common.
Finally let ΔΠ denote any one of the rectangles of which Π consists and let
NΠ denote the norm of Π.
Then if K is a set in R₀ and

$\lim_{N\Pi = 0} \sum_{\Delta\Pi} f(\Delta\Pi)\,\epsilon(K, \Delta\Pi)$

exists, it is called the outer content of K relative to f. In the above the
summation is naturally over all ΔΠ. If

$\lim_{N\Pi = 0} \sum_{\Delta\Pi} f(\Delta\Pi)\,[1 - \epsilon(R_0 - K, \Delta\Pi)]$

exists, it is called the inner content of K relative to f. If both the outer and
the inner contents of K relative to f exist and are equal, their common value
is called the content of K relative to f.

The outer content of K relative to f exists absolutely if it not only exists
relative to f but also relative to |f|. Similar definitions are given to the
absolute existence of the inner content and of the content itself.

A set K is squarable relative to f if its content exists and equals zero;
it is absolutely squarable relative to f if its content exists absolutely relative
to f and is zero.

We mention the following obvious theorem: 


THEOREM 1. If a set K is absolutely squarable every subset of K is absolutely 
squarable. 


We now prove 


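THEOREM 2. If the boundary of a set K, entirely interior to R₀, is absolutely
squarable relative to f, then the content of K exists if either the inner content or
the outer content exists relative to f.


For
$\sum_{\Delta\Pi} f(\Delta\Pi)\,\epsilon(K, \Delta\Pi) = \sum_{\Delta\Pi} f(\Delta\Pi)\,[1 - \epsilon(R_0 - K, \Delta\Pi)] + T(\Pi),$
where
$T(\Pi) = \sum_{\Delta\Pi} f(\Delta\Pi)\,\epsilon(K, \Delta\Pi)\,\epsilon(R_0 - K, \Delta\Pi).$


But
$|T(\Pi)| \le \sum_{\Delta\Pi} |f(\Delta\Pi)|\,\epsilon(K_b, \Delta\Pi),$


where K_b denotes the boundary of K. Hence $\lim_{N\Pi = 0} T(\Pi) = 0$, from which
the theorem follows.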


1928] RELATIVE CONTENT AND GREEN’S LEMMA 407 


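THEOREM 3. If K₁ and K₂ are both interior to R₀ and have no points in
common and if each has inner content (f) and one of their boundaries is squarable
absolutely (f), then the inner content (f) of K₁ + K₂ exists equal to the sum of
the inner contents (f) of K₁ and K₂.


For suppose K_{1b}, the boundary of K₁, is squarable absolutely (f). Then,
if we set K = K₁ + K₂,


$\sum_{\Delta\Pi} f(\Delta\Pi)\,[1 - \epsilon(R_0 - K, \Delta\Pi)] = \sum_{\Delta\Pi} f(\Delta\Pi)\,[1 - \epsilon(R_0 - K_1, \Delta\Pi)]$


$\qquad + \sum_{\Delta\Pi} f(\Delta\Pi)\,[1 - \epsilon(R_0 - K_2, \Delta\Pi)] + T(\Pi),$
where
$T(\Pi) = \sum_{\Delta\Pi} f(\Delta\Pi)\,\epsilon(K_1, \Delta\Pi)\,\epsilon(K_2, \Delta\Pi)\,[1 - \epsilon(R_0 - K, \Delta\Pi)].$
But
$|T(\Pi)| \le \sum_{\Delta\Pi} |f(\Delta\Pi)|\,\epsilon(K_{1b}, \Delta\Pi).$


Hence $\lim_{N\Pi = 0} T(\Pi) = 0$, from which the theorem follows.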


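THEOREM 4. If K₁ and K₂ are interior to R₀ and have no points in common
and if K₁ and K₂ each have outer content (f) and one of their boundaries is
squarable absolutely (f), then the outer content (f) of K₁ + K₂ exists equal to
the sum of the outer contents (f) of K₁ and K₂.


For suppose K_{1b}, the boundary of K₁, is absolutely squarable (f). Then
$\sum_{\Delta\Pi} f(\Delta\Pi)\,\epsilon(K_1 + K_2, \Delta\Pi) = \sum_{\Delta\Pi} f(\Delta\Pi)\,\epsilon(K_1, \Delta\Pi) + \sum_{\Delta\Pi} f(\Delta\Pi)\,\epsilon(K_2, \Delta\Pi) - T(\Pi),$


where
$T(\Pi) = \sum_{\Delta\Pi} f(\Delta\Pi)\,\epsilon(K_1, \Delta\Pi)\,\epsilon(K_2, \Delta\Pi).$
But
$|T(\Pi)| \le \sum_{\Delta\Pi} |f(\Delta\Pi)|\,\epsilon(K_{1b}, \Delta\Pi).$
Hence $\lim_{N\Pi = 0} T(\Pi) = 0$; from which the theorem follows.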
Theorems 2, 3 and 4 now give 


THEOREM 5. If K₁ and K₂ are interior to R₀ and have no points in common
and if K₁ and K₂ each have content (f) and one of their boundaries is absolutely
squarable (f), then the content of K₁ + K₂ exists (f) and equals the sum of the
contents (f) of K₁ and K₂.


We shall need the following special case of Theorem 5. 




THEOREM 6. If K is the interior of a simple closed curve C which is interior
to R₀ and K has inner content (f) and C is absolutely squarable (f), then K and
C + K each have content (f) and their contents are equal.


2. On the existence of relative content. Let P(xy) and Q(xy) be defined
on R₀. Then let two (multiply-valued) functions $P^{(x)}(R)$ and $Q^{(y)}(R)$ be
defined as follows:


$P^{(x)}(R) = P(x''y) - P(x'y)$, y' ≤ y ≤ y'',
$Q^{(y)}(R) = Q(xy'') - Q(xy')$, x' ≤ x ≤ x'',


where R is the rectangle x' ≤ x ≤ x'', y' ≤ y ≤ y'', which is assumed to be in
R₀. In this section we shall be interested in content relative to $P^{(x)}Q^{(y)}$ in
the special case where Q(xy) = y.


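THEOREM 7. If P_x, the first partial derivative of P with respect to x, exists
on K, the interior of a simple closed squarable curve C in R₀, and if P_x is bounded
and integrable on K, then the inner content of K relative to $P^{(x)}y^{(y)}$ exists
absolutely and equals $\iint_K P_x\,dxdy$.


To prove this, note that by the mean value theorem
$\sum_{\Delta\Pi} P^{(x)}(\Delta\Pi)\,y^{(y)}(\Delta\Pi)\,[1 - \epsilon(R_0 - K, \Delta\Pi)]$
$\qquad = \sum_{\Delta\Pi} P_x(\zeta^{\Delta\Pi})\,\Delta\Pi\,[1 - \epsilon(R_0 - K, \Delta\Pi)],$
where $\zeta^{\Delta\Pi}$ is a point (xy) in ΔΠ. But
$\sum_{\Delta\Pi} P_x(\zeta^{\Delta\Pi})\,\Delta\Pi\,[1 - \epsilon(R_0 - K, \Delta\Pi)] = \sum_{\Delta\Pi} P_x(\zeta^{\Delta\Pi})\,(K \cdot \Delta\Pi)\,\epsilon(K, \Delta\Pi) - T(\Pi),$


where K·ΔΠ denotes the set of points common to K and ΔΠ and where


$T(\Pi) = \sum_{\Delta\Pi} P_x(\zeta^{\Delta\Pi})\,(K \cdot \Delta\Pi)\,\epsilon(R_0 - K, \Delta\Pi)\,\epsilon(K, \Delta\Pi);$


here $\zeta^{\Delta\Pi}$ has already been defined if ε(R₀ - K, ΔΠ) = 0 and is defined as any
point (xy) in K·ΔΠ otherwise. But then


$\lim_{N\Pi = 0} \sum_{\Delta\Pi} P_x(\zeta^{\Delta\Pi})\,(K \cdot \Delta\Pi)\,\epsilon(K, \Delta\Pi) = \iint_K P_x\,dxdy,$


and since
$|T(\Pi)| \le N \sum_{\Delta\Pi} \Delta\Pi\,\epsilon(C, \Delta\Pi),$


where N is the least upper bound of |P_x| on K, it follows that
$\lim_{N\Pi = 0} T(\Pi) = 0.$


This proves the theorem.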




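THEOREM 8. If P_x exists and is bounded and integrable on R₀ and if C is a
simple closed squarable curve interior to R₀, then C is absolutely squarable
($P^{(x)}y^{(y)}$).


For
$\sum_{\Delta\Pi} |P^{(x)}(\Delta\Pi)\,y^{(y)}(\Delta\Pi)|\,\epsilon(C, \Delta\Pi) = \sum_{\Delta\Pi} |P_x(\zeta^{\Delta\Pi})|\,\Delta\Pi\,\epsilon(C, \Delta\Pi) \le N \sum_{\Delta\Pi} \Delta\Pi\,\epsilon(C, \Delta\Pi),$
where N is the least upper bound of |P_x| on R₀. Hence


$\lim_{N\Pi = 0} \sum_{\Delta\Pi} |P^{(x)}(\Delta\Pi)\,y^{(y)}(\Delta\Pi)|\,\epsilon(C, \Delta\Pi) = 0,$


which is the theorem.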


THEOREM 9. If K is a region bounded by a simple closed squarable curve C
and P_x exists and is bounded and integrable on R₀, then K and K + C each has
content equal to $\iint_K P_x\,dxdy$ relative to $P^{(x)}y^{(y)}$.


This follows from Theorems 6, 7 and 8. 
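The content of Theorems 7 and 9 can be illustrated numerically. The sketch below is an illustration under stated assumptions, not part of the paper: it takes K to be the open unit disk, R₀ the square |x|, |y| ≤ 1.5, P(x, y) = x³/3 (so that P_x = x²), and square partitions; the multiply-valued P^{(x)}(ΔΠ) is evaluated at the mid-height of each cell. It compares the inner sum over cells lying wholly inside K with the line integral of P dy around the circle; both should be close to π/4.

```python
import math

def P(x, y):
    """Illustrative choice of P; its x-derivative is x*x."""
    return x ** 3 / 3.0

def inner_sum(n_cells, half_width=1.5):
    """Sum of P^(x)(cell) * y^(y)(cell) over grid cells lying inside the open unit disk."""
    h = 2.0 * half_width / n_cells
    total = 0.0
    for i in range(n_cells):
        x_lo = -half_width + i * h
        x_hi = x_lo + h
        far_x = max(abs(x_lo), abs(x_hi))
        for j in range(n_cells):
            y_lo = -half_width + j * h
            y_hi = y_lo + h
            far_y = max(abs(y_lo), abs(y_hi))
            # The disk is convex, so the cell is inside it iff its farthest corner is.
            if far_x * far_x + far_y * far_y < 1.0:
                y_mid = 0.5 * (y_lo + y_hi)
                total += (P(x_hi, y_mid) - P(x_lo, y_mid)) * (y_hi - y_lo)
    return total

def line_integral(n_steps):
    """Midpoint-rule value of the integral of P dy over x = cos t, y = sin t, 0 <= t <= 2*pi."""
    dt = 2.0 * math.pi / n_steps
    total = 0.0
    for k in range(n_steps):
        t = (k + 0.5) * dt
        total += P(math.cos(t), math.sin(t)) * math.cos(t) * dt
    return total

exact = math.pi / 4.0          # double integral of x*x over the unit disk
approx_inner = inner_sum(300)
approx_line = line_integral(2000)
print(exact, approx_inner, approx_line)
```

The inner sum approaches the exact value only slowly, at a rate governed by the total area of the cells meeting the boundary curve, which is the quantity the squarability hypotheses control.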

It has now become necessary to make an additional assumption concerning
𝔖. We say a partition Π of R₀ is of type (A) if* $x^{(x)}(\Delta\Pi)$ is constant for
all ΔΠ of Π and if $y^{(y)}(\Delta\Pi) \le x^{(x)}(\Delta\Pi)$ for every ΔΠ of Π. A partition Π of
R₀ is of type (B) if it can be obtained from a partition of type (A) by subdividing
some or all of the cells of that partition into at most three parts each by
means of vertical lines. We assume throughout the remainder of the paper
that 𝔖 consists of all partitions of R₀ of type (A) or type (B).

We are now in a position to prove 


LEMMA 1. If C is a simple closed rectifiable curve interior to R₀, then


$\sum_{\Delta\Pi} y^{(y)}(\Delta\Pi)\,\epsilon(C, \Delta\Pi) \le 6 \cdot 2^{1/2} C,$


for every Π with norm sufficiently small, where C denotes the length of C.
To prove this we note that if r is less than one-half the diameter of C,
$C_{(r)} \le 2rC,$†


where $C_{(r)}$ denotes the outer content of the set of all points distant by not
more than r from C. But if also $r = 2^{1/2} x^{(x)}(\Delta\Pi)$, and Π is of type (A),


* Naturally we are here considering only partitions of R₀ which satisfy condition (1) of §1.

† This follows from a similar inequality for a simple arc stated by Gross, Monatshefte, vol. 29,
p. 177. The proof given by Gross is incomplete; a correct proof is to be found in the author's Chicago
dissertation.


$\sum_{\Delta\Pi} \Delta\Pi\,\epsilon(C, \Delta\Pi) \le C_{(r)} \le 2rC = 2 \cdot 2^{1/2}\,x^{(x)}(\Delta\Pi)\,C.$


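Hence since $\Delta\Pi = x^{(x)}(\Delta\Pi)\,y^{(y)}(\Delta\Pi)$ and $x^{(x)}(\Delta\Pi)$ is constant,


$\sum_{\Delta\Pi} y^{(y)}(\Delta\Pi)\,\epsilon(C, \Delta\Pi) \le 2 \cdot 2^{1/2} C,$


which proves the result for type (A); from this the result easily follows for
type (B).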
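THEOREM 10. If P is continuous in x uniformly as to (xy) on R₀ and C is a
simple closed rectifiable curve interior to R₀, then C is absolutely squarable
($P^{(x)}y^{(y)}$).


For
$\sum_{\Delta\Pi} |P^{(x)}(\Delta\Pi)\,y^{(y)}(\Delta\Pi)|\,\epsilon(C, \Delta\Pi) \le a(\Pi) \sum_{\Delta\Pi} y^{(y)}(\Delta\Pi)\,\epsilon(C, \Delta\Pi) \le a(\Pi) \cdot 6 \cdot 2^{1/2} C,$


where a(Π) is the largest value of $|P^{(x)}(\Delta\Pi)|$ for all ΔΠ of Π. But on account
of the uniform continuity,


$\lim_{N\Pi = 0} a(\Pi) = 0,$


from which the theorem follows.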

3. On the x-linear extension of P. Its uniform continuity. So far we
have been considering the function P(xy) as defined on the entire rectangle
R₀. We now suppose that P(xy) is defined on a closed set S interior to R₀
and show how to extend its definition to the entire plane. To this end let
(x₀y₀) be a point not in S. Let $\underline{x}_0$ be the lower bound of all x' such that for
x' < x < x₀ the point (xy₀) is not in S. Since S is closed it is clear that if $\underline{x}_0$
is finite the point ($\underline{x}_0$y₀) is in S. Similarly let $\bar{x}_0$ be the upper bound of all x''
such that for x₀ < x < x'' the point (xy₀) is not in S; if $\bar{x}_0$ is finite, the point
($\bar{x}_0$y₀) is in S. We now define P(x₀y₀) as follows:


$P(x_0 y_0) = P(\underline{x}_0 y_0) + (x_0 - \underline{x}_0)\,P(\underline{x}_0 \bar{x}_0, y_0),$


if $\underline{x}_0$, $\bar{x}_0$ are both finite, where


$P(\underline{x}_0 \bar{x}_0, y_0) = [P(\bar{x}_0 y_0) - P(\underline{x}_0 y_0)] / [\bar{x}_0 - \underline{x}_0].$


We also make the following definitions: $P(x_0 y_0) = P(\bar{x}_0 y_0)$ if $\underline{x}_0$ only is
infinite; $P(x_0 y_0) = P(\underline{x}_0 y_0)$ if $\bar{x}_0$ only is infinite; $P(x_0 y_0) = 0$ if both $\underline{x}_0$ and $\bar{x}_0$
are infinite. We call the function whose definition has been thus extended
the x-linear extension of P.
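The construction just described can be sketched in code for a single horizontal line y = y₀. This is a minimal illustration under assumptions not made in the paper: the trace of S on the line is taken to be a finite list of disjoint closed intervals, and P along the line is supplied as an ordinary function of x. The names are hypothetical.

```python
import math

def x_linear_extension(x0, intervals, P):
    """Value at x0 of the x-linear extension of P along one horizontal line.

    intervals: disjoint closed intervals (a, b) forming the trace of S on the line.
    P: function of x giving the values of P on S along that line.
    """
    lower = -math.inf   # nearest point of S to the left of x0
    upper = math.inf    # nearest point of S to the right of x0
    for a, b in intervals:
        if a <= x0 <= b:
            return P(x0)              # x0 already lies in S: nothing to extend
        if b < x0:
            lower = max(lower, b)
        if a > x0:
            upper = min(upper, a)
    if lower == -math.inf and upper == math.inf:
        return 0.0                    # no point of S on this line
    if lower == -math.inf:
        return P(upper)               # only the right-hand neighbour is finite
    if upper == math.inf:
        return P(lower)               # only the left-hand neighbour is finite
    # Both neighbours finite: interpolate linearly between them.
    slope = (P(upper) - P(lower)) / (upper - lower)
    return P(lower) + (x0 - lower) * slope

# Usage: S meets the line in [0, 1] and [3, 4], and P(x) = x*x there.
square = lambda x: x * x
print(x_linear_extension(2.0, [(0.0, 1.0), (3.0, 4.0)], square))
```

Between the two intervals the extension is the chord joining P(1) = 1 to P(3) = 9, and outside them it is constant.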




THEOREM 11. If P is defined on and interior to a simple closed curve C,
is continuous on C and has a bounded first partial derivative P_x on K, the
interior of C, then the x-linear extension of P is continuous as to x uniformly
as to (xy) on R₀.


To prove this we note that since P is continuous on C it is uniformly
continuous there, that is, there is a system ($d_e'$ | e) such that


$|P(x_1 y_1) - P(x_2 y_2)| \le e/3$


for (x₁y₁), (x₂y₂) on C and $|x_1 - x_2| < d_e'$, $|y_1 - y_2| < d_e'$.

We say two points (x₀'y₀), (x₀''y₀) are of the first kind if they are both
on C; of the second kind if (xy₀) is inside C for every x between x₀' and x₀'';
of the third kind if (xy₀) is outside C for every x between x₀' and x₀''.

Consider first a pair (x₀'y₀), (x₀''y₀) of the first kind. It follows from the
above that for such a pair


$|P(x_0' y_0) - P(x_0'' y_0)| \le e/3$
if $|x_0' - x_0''| < d_e'$.
Next consider a pair of the second kind. Here by the mean value theorem


$|P(x_0' y_0) - P(x_0'' y_0)| \le |P_x(x''' y_0)|\,|x_0' - x_0''| \le N |x_0' - x_0''| \le e/3$


if $|x_0' - x_0''| < d_e''$, where $d_e''$ is the smaller of $d_e'$ and e/(3N), N being the
least upper bound of |P_x| on K, and x''' is between x₀' and x₀''.

Consider next a pair of the third kind. In this case there is a pair of the
first kind and also of the third kind* ($\underline{x}_0$y₀), ($\bar{x}_0$y₀) such that $\underline{x}_0 \le x_0'$,
$x_0'' \le \bar{x}_0$. But then by definition of x-linear extension


$|P(x_0' y_0) - P(x_0'' y_0)| = |P(\bar{x}_0 y_0) - P(\underline{x}_0 y_0)|\,|x_0' - x_0''| / |\bar{x}_0 - \underline{x}_0| \le e/3$


for $|x_0' - x_0''| < d_e$, where $d_e$ is the smaller of $d_e''$ and $e\,d_e'/(6M)$, M being
the least upper bound of |P| on C.

We now consider an arbitrary pair of points (x₀'y₀), (x₀''y₀) in R₀ such
that $|x_0' - x_0''| < d_e$. The interval (x₀'x₀'') can be broken up into at most
three sub-intervals each of which with y₀ gives rise to a pair of points either
of the first or second or third kind. Hence


$|P(x_0' y_0) - P(x_0'' y_0)| \le e/3 + e/3 + e/3 = e$


for $|x_0' - x_0''| < d_e$, which proves the desired theorem.
Theorems 6, 7, 10, 11 now give 


* This is true unless P(x₀'y₀) = P(x₀''y₀), in which case the result is obvious.




THEOREM 12. If P is defined on a simple closed rectifiable curve C and its
interior K, is continuous on C and possesses a bounded integrable first partial
derivative P_x on K, then K and K + C each have content equal to $\iint_K P_x\,dxdy$
relative to $P_1^{(x)}y^{(y)}$ and C is absolutely squarable relative to $P_1^{(x)}y^{(y)}$, where
P₁ is the x-linear extension of P.


4. The generalized Green's lemma. It is the purpose of this section to
prove


THEOREM 13. If P is defined on R₀, and C, a simple closed curve interior
to R₀, is absolutely squarable ($P^{(x)}y^{(y)}$), if K, the interior of C, has inner content
($P^{(x)}y^{(y)}$), and if, moreover, the integral $\int_C P\,dy$ exists,* then


$\int_C P\,dy = \mathrm{cont}_{P^{(x)}y^{(y)}} K = \mathrm{cont}_{P^{(x)}y^{(y)}} (K + C).$

Let
$C:\ x = \phi(t),\ y = \psi(t)$ (0 ≤ t ≤ 1)


be parametric equations of C such that as t varies from 0 to 1, C is described
in the positive sense. For brevity write


Po(t) = Plo(t), 
Now let ¢ be a fixed positive number and >» a fixed partition of (01) into 
intervals Azo, 
To: to( = 0), 41, = 1), 
and suppose 7» is such that if +F7o,} that is, if x is a partition obtained by 
subdividing some or all of the intervals of 10, then 
é 


+ 


| f Pdy — S,P,Av |< 
Cc 


where 


S. = Po(Ar) + 


Next let us form a partition II, of $ by dividing Ro into horizontal 
strips pi, - - - , px Closed and non-overlapping (except for boundary points) 
and also into equal vertical strips a, - - - , 0, of the same character in such 
a way that 


* It is sufficient to assume the integral exists in the weak sense; that is all that is actually used. 
(Cf. the author’s paper cited above.) 
Loc. cit., p. 492, 




(1) the width of each horizontal strip is at most equal to the common 
width of the vertical strips; 
(2) each of the points (¢(¢,), ¥(¢;)) which.corresponds to a division point 
t; of wo is on the common boundary of two adjacent horizontal strips; 
(3) the inequality 
> | | y(alle(C, Al) — 
Au 2 
holds for every partition IIFII, ; 
(4) the inequality 
e 


contp® wK — (All) y [1 — «(Ry —K, 
aul 


holds for every partition IIFII.. 

Let us now consider the intersection K_r of K with the interior of ρ_r.
Since it is a region (that is, set of inner points), it can be resolved (uniquely)
into a finite or denumerably infinite number of connected regions:


$K_r = Q_{r1} + Q_{r2} + \cdots.$


We say a region $Q_{ri}$ is of the first kind if its boundary contains an arc of C
which has points of intersection with both the upper and the lower boundaries
of ρ_r; otherwise $Q_{ri}$ is of the second kind. It is easily shown that for given r
there are but a finite number of $Q_{ri}$ of the first kind; suppose the notation so
chosen that they are $Q_{r1}, \ldots, Q_{r i_r}$. Let $Q_r$ be the sum of the $Q_{ri}$ of the second
kind. Then


$K_r = Q_r + \sum_{i=1}^{i_r} Q_{ri},$


where all the $Q_{ri}$ are of the first kind.

It is now easily shown that since $Q_{ri}$ (i = 1, ..., i_r) is connected, its
boundary consists of (1) an arc $a'_{ri}$ of C lying entirely within ρ_r except for its
end points, of which the first* lies on the lower boundary of ρ_r and the second
on the upper boundary of ρ_r; (2) an arc $a''_{ri}$ of the same character except that
its first and second end points are respectively on the upper and lower
boundaries of ρ_r; (3) a finite or denumerably infinite number of arcs of C
each with its end points on the same boundary line of ρ_r; (4) a finite or
denumerably infinite number of segments of the upper and lower boundaries
of ρ_r.


* The first end point is the one which corresponds to the smaller value of t.




We next form a certain partition π₁ of (01). To this end let $I'_{ri}$, $I''_{ri}$ be the
t-intervals corresponding to arcs $a'_{ri}$, $a''_{ri}$, respectively. If a division point
t_j of π₀ is not an end point of some $I'_{ri}$ or $I''_{ri}$, it is an end point of some I_j
which is the t-interval corresponding to an arc of C which lies entirely in some
strip ρ_r and has its end points on the same horizontal boundary line of that
strip. Now take π₁ as the partition whose points of division are the end points
of the intervals $I'_{ri}$, $I''_{ri}$ and existent intervals I_j.

It can now be proved that if Az; is an interval of 7, then ¥(Am:) =~(A7) 
—w(Am:) =0 unless Am is an Jj, or an For then Am; is either (1) an J,, 
or (2) between an J, and an J’; or I/!, or (3) between two intervals of types 

), I//. In case (1), ¥(Am1)=0 obviously. The same also holds in case 
(2); for otherwise Ar, would correspond to an arc of C with end points on 
different horizontal boundary lines of p,. But then this arc would contain an 
arc lying entirely in some p, and with its end points on different horizontal 
boundary lines of that p, and would therefore correspond to an //; or to an 
I/}, that is, Aw, would contain some J/; or some J//, contrary to hypo- 
thesis. The case (3) is similar, and the conclusion is established. 

From what has just been proved, it follows that 


= Poles) + Po(Les) } 


(1) + Poles) + Po(Ls) 
But 

WL) = = ye, say, 

= = say, 


= — = Aye, say. 
Moreover let us write 

Then (1) may be written 


1 
S,°P = Day, — } 


fr 


1 a? 
(2) + Lay — } 




Now let R,;, Ri; be the rectangles 


Res: Sx Sin, 


Ry 


Also let a &., be respectively the upper and lower bounds of values of x 


for points (xy) in the (existent) rectangle o,- R/;. Then 


(3) — Ave = — Fe) 


Therefore 


contp®,~ K — > [PE | 


+> 


[1 Ors, os Ris) Jay, 


é, 


Similarly 
3 


so that by (4) 
(4) | contp@,oK — S,,°PoAy| < Ze. 


(5) f Pdy — | She. 
Cc 


From (4) and (5) the theorem follows, since e is arbitrary. 


5. On the existence of the line integral of Green's lemma. Let the curve
$C$ and its parametric representation be as above with the additional
requirement that the representation be one-to-one for $0 \le t < 1$. Let $P(xy)$
be defined on $R_0$. We seek sufficient conditions that $\int_C P_0(t)\,d\psi(t)$ exist,
where $P_0$ is as above. The first such condition is given by

THEOREM 14. If $P$ is continuous on $C$, the integral $\int_C P_0(t)\,d\psi(t)$ exists if
$\psi(t)$ is of limited variation, in particular if $C$ is rectifiable.



Lei VS YS 

But 




For then $P_0(t)$ is continuous and the theorem follows from a well known
theorem on the Stieltjes integral.

We next obtain a condition less restrictive on C; to this end we first prove 
two lemmas. 


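Lemma 2. In order that

$$L \sum_{\Delta\pi} (O_{\Delta\pi}P_0)\,|\psi(\Delta\pi)| = 0$$

it is sufficient that
(i) $P$ satisfy the Lipschitz condition

$$|P(x_1y_1) - P(x_2y_2)| \le A\,|x_1 - x_2| + B\,|y_1 - y_2|$$

for every pair of points $(x_1y_1)$, $(x_2y_2)$ on $C$;

(ii) $L \sum_{\Delta\pi} B\,(O_{\Delta\pi}\psi)\,|\psi(\Delta\pi)| = 0$;
and
(iii)* $L \sum_{\Delta\pi} A\,(O_{\Delta\pi}\phi)\,|\psi(\Delta\pi)| = 0$.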


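To prove this note that for every partition $\pi$ of (01) there is a system of
points $t'_{\Delta\pi}$, $t''_{\Delta\pi}$, both in $\Delta\pi$, such that

$$0 \le O_{\Delta\pi}P_0 \le 2\,|P_0(t'_{\Delta\pi}) - P_0(t''_{\Delta\pi})|$$
$$\le 2A\,|\phi(t'_{\Delta\pi}) - \phi(t''_{\Delta\pi})| + 2B\,|\psi(t'_{\Delta\pi}) - \psi(t''_{\Delta\pi})|$$
$$\le 2\,[A\,O_{\Delta\pi}\phi + B\,O_{\Delta\pi}\psi].$$

Hence

$$\sum_{\Delta\pi} (O_{\Delta\pi}P_0)\,|\psi(\Delta\pi)| \le 2\Big[A \sum_{\Delta\pi} (O_{\Delta\pi}\phi)\,|\psi(\Delta\pi)| + B \sum_{\Delta\pi} (O_{\Delta\pi}\psi)\,|\psi(\Delta\pi)|\Big].$$

Therefore

$$L \sum_{\Delta\pi} (O_{\Delta\pi}P_0)\,|\psi(\Delta\pi)| = 0,$$

as was to be proved.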
Lemma 3. The condition

$$L \sum_{\Delta\pi} (O_{\Delta\pi}P_0)\,|\psi(\Delta\pi)| = 0$$


* The condition (iii) is equivalent to the same condition with $A$ omitted if $A \ne 0$; but it is desired
to include the case when P is independent of y, in which case A may be taken to be 0; the condition 
is then satisfied for all y. A similar remark applies to (ii). 




is sufficient for the existence of $\int_C P_0(t)\,d\psi(t)$ provided that $P$ is continuous on $C$
and that $P(x'y) \le P(x''y)$ for every pair of points $(x'y)$, $(x''y)$ on $C$ such
that $x' < x''$.

For then the functions $P_0(t)$, $\psi(t)$ satisfy one of the sufficient conditions
of Corollary 1, p. 505 of the author's paper cited above.
We can now state the desired condition as 


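THEOREM 15. The integral $\int_C P_0(t)\,d\psi(t)$ exists if
(i) $P_x(xy)$ exists and is less in absolute value than a fixed number $N$ on $R_0$,
and for each $y$, $P_x(xy)$ is a continuous function of $x$;
(ii) $P$, $P_x$ satisfy the Lipschitz conditions

$$|P(xy_1) - P(xy_2)| \le C\,|y_1 - y_2|,$$
$$|P_x(xy_1) - P_x(xy_2)| \le D\,|y_1 - y_2|$$

for every pair of points $(xy_1)$, $(xy_2)$ on $R_0$;

(iii) $L \sum_{\Delta\pi} (C + D)\,(O_{\Delta\pi}\psi)\,|\psi(\Delta\pi)| = 0$;
(iv) $L \sum_{\Delta\pi} N\,(O_{\Delta\pi}\phi)\,|\psi(\Delta\pi)| = 0$.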


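To prove this let us form the functions

$$P'(xy) = P(ay) + \frac{1}{2}\int_a^x \{\,|P_x(uy)| + P_x(uy)\,\}\,du,$$

$$P''(xy) = \frac{1}{2}\int_a^x \{\,|P_x(uy)| - P_x(uy)\,\}\,du.$$

Clearly
(6) $P(xy) = P'(xy) - P''(xy)$
and
(7) $P'(x'y) \le P'(x''y)$, $P''(x'y) \le P''(x''y)$ $(x' < x'')$.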


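Now consider the function $P'(xy)$. We have

(8) $|P'(x_1y) - P'(x_2y)| = \frac{1}{2}\Big|\int_{x_1}^{x_2} \{\,|P_x(uy)| + P_x(uy)\,\}\,du\Big| \le N\,|x_1 - x_2|.$

Also

$$|P'(xy_1) - P'(xy_2)| \le |P(ay_1) - P(ay_2)| + \frac{1}{2}\Big|\int_a^x \{\,|P_x(uy_1)| - |P_x(uy_2)| + P_x(uy_1) - P_x(uy_2)\,\}\,du\Big|$$
$$\le |P(ay_1) - P(ay_2)| + \int_a^x |P_x(uy_1) - P_x(uy_2)|\,du$$
$$\le C\,|y_1 - y_2| + \int_a^x D\,|y_1 - y_2|\,du$$

(9) $= C\,|y_1 - y_2| + (x - a)\,D\,|y_1 - y_2| \le E\,|y_1 - y_2|,$

where $E = C + (b - a)D$.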

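From (8) and (9) we get
(10) $|P'(x_1y_1) - P'(x_2y_2)| \le |P'(x_1y_1) - P'(x_1y_2)| + |P'(x_1y_2) - P'(x_2y_2)| \le N\,|x_1 - x_2| + E\,|y_1 - y_2|.$
If we write

$$P'_0(t) = P'[\phi(t), \psi(t)],$$

we see, from (10) and the hypothesis, that the conditions of Lemma 2 are
satisfied* and hence

(11) $L \sum_{\Delta\pi} (O_{\Delta\pi}P'_0)\,|\psi(\Delta\pi)| = 0.$

But (11) and (7) show (Lemma 3) that $\int_C P'_0(t)\,d\psi(t)$ exists. Similarly if
$P''_0(t) = P''[\phi(t), \psi(t)]$, it can be shown that $\int_C P''_0(t)\,d\psi(t)$ exists. Hence,
by (6), $\int_C P_0(t)\,d\psi(t)$ exists and equals $\int_C P'_0(t)\,d\psi(t) - \int_C P''_0(t)\,d\psi(t)$, and
the theorem is proved.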

6. Two special cases. We can now state two important special cases of 
Green’s lemma. The first is given by 


THEOREM 16. If $P(xy)$ is defined and continuous on a simple closed
rectifiable curve $C$ and is defined and possesses a bounded integrable partial derivative
$P_x(xy)$ on $K$, the interior of $C$, then $\int_C P(xy)\,dy$ exists and

$$\int_C P(xy)\,dy = \iint_K P_x(xy)\,dx\,dy.$$ †


* The continuity of P in x and y together follows from (i) and (ii) which imply respectively that 
P is continuous in x for every y and in y uniformly as to x. 

† This is the result obtained by Gross, except that he assumes $P_x(xy)$ to be summable instead of
Riemann integrable.




This theorem follows from Theorems 12, 13, 14. 
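
As a numerical illustration of Theorem 16 (not part of the original argument), the following sketch checks the identity in one concrete case. Everything specific in it is an assumption made only for the example: $C$ is taken to be the unit circle with the parametrization $x = \cos 2\pi t$, $y = \sin 2\pi t$, and $P(xy) = xy + x$, so that $P_x = y + 1$ and both sides should be close to $\pi$.

```python
import math

# Hypothetical example data: P(x, y) = x*y + x on the unit disk.
def P(x, y):
    return x * y + x

def P_x(x, y):
    # Partial derivative of P with respect to x.
    return y + 1.0

def line_integral(n=20000):
    """Stieltjes-type sum for the integral of P dy over the unit circle."""
    total = 0.0
    for i in range(n):
        t0, t1 = i / n, (i + 1) / n
        tm = 0.5 * (t0 + t1)
        x = math.cos(2 * math.pi * tm)
        y = math.sin(2 * math.pi * tm)
        # Increment of psi(t) = sin(2*pi*t) over the subinterval.
        dpsi = math.sin(2 * math.pi * t1) - math.sin(2 * math.pi * t0)
        total += P(x, y) * dpsi
    return total

def double_integral(nr=400, nth=400):
    """Midpoint rule in polar coordinates for the integral of P_x over the disk."""
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) / nr
        for j in range(nth):
            th = 2 * math.pi * (j + 0.5) / nth
            total += P_x(r * math.cos(th), r * math.sin(th)) * r
    return total * (1.0 / nr) * (2 * math.pi / nth)

if __name__ == "__main__":
    print(line_integral(), double_integral(), math.pi)
```

The two sums approximate the left and right members of the formula of Theorem 16; agreement in one smooth case is of course only a sanity check, not evidence for the general statement.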
The second is given by 


THEOREM 17. If $P(xy)$ is defined on $R_0$ and possesses a partial derivative
$P_x(xy)$ on the interior of $R_0$ and if $C$ is a simple closed squarable curve interior
to $R_0$, and $K$ is its interior, then

$$\int_C P(xy)\,dy = \iint_K P_x(xy)\,dx\,dy$$

provided $\int_C P(xy)\,dy$ exists and $P_x(xy)$ is bounded and integrable on $R_0$, in
particular, provided

(i) $P_x(xy)$ is, for each $y$, a continuous function of $x$;
(ii) $P$ and $P_x$ satisfy the Lipschitz conditions

$$|P(xy_1) - P(xy_2)| \le C\,|y_1 - y_2|,$$
$$|P_x(xy_1) - P_x(xy_2)| \le D\,|y_1 - y_2|,$$

for every pair of points $(xy_1)$, $(xy_2)$ in $R_0$;


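(iii) $L \sum_{\Delta\pi} (C + D)\,(O_{\Delta\pi}\psi)\,|\psi(\Delta\pi)| = 0$;
(iv) $L \sum_{\Delta\pi} N\,(O_{\Delta\pi}\phi)\,|\psi(\Delta\pi)| = 0$,

where $N$ is the upper bound of $|P_x|$ on $R_0$.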


This theorem follows from Theorems 8, 9, 13, 15, since the hypothesis 
implies the continuity in x and y together of P and P, (see first footnote, 
p. 418). 


LOUISIANA STATE UNIVERSITY,
BATON ROUGE, LA.


ON BELL’S ARITHMETIC OF BOOLEAN ALGEBRA* 


BY 
WALLIE ABRAHAM HURWITZ 


1. Introduction. Bell† has constructed an arithmetic for an algebra
of logic or (following the terminology of Sheffer‡) a Boolean algebra, which
presents gratifying analogies to the arithmetics of rational and other fields. 
In only one detail is the similarity less close than seems appropriate to the 
difference in the structures of the algebras themselves—namely, in the 
properties of the concept congruence. In this note I shall show that a slightly 
more general definition retains all the properties of the congruence given 
by Bell and restores several analogies to rational arithmetic lost by the Bell 
definition. 

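All the notation and terminology of the paper of Bell (to which reference
should be made for results not here repeated) other than those relating to
congruence are followed in the present treatment. In particular, we utilize
the two (dual) interpretations of arithmetic operations and relations in $\mathfrak{L}$:

      Name                    Symbol                Interpretation I        Interpretation II
(s)   Arithmetic sum:         $\alpha\,s\,\beta$:   $\alpha + \beta$,       $\alpha\beta$;
(p)   Arithmetic product:     $\alpha\,p\,\beta$:   $\alpha\beta$,          $\alpha + \beta$;
(g)   G. C. D.:               $\alpha\,g\,\beta$:   $\alpha + \beta$,       $\alpha\beta$;
(l)   L. C. M.:               $\alpha\,l\,\beta$:   $\alpha\beta$,          $\alpha + \beta$;
(ζ)   Arithmetic zero:        $\zeta$:              $\omega$,               $\varepsilon$;
(υ)   Arithmetic unity:       $\upsilon$:           $\varepsilon$,          $\omega$;
(d)   $\alpha$ divides $\beta$:   $\alpha\,d\,\beta$:   $\alpha \mid \beta$,    $\beta \mid \alpha$.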
2. Residuals. It will be convenient to amplify slightly Bell's treatment
of residuals. By the residual of $b$ with respect to $a$ in $\mathfrak{A}$, $b\,r\,a$, is meant the
quotient of $a$ by the G. C. D. of $a$ and $b$. Transforming this into a form
equivalent for $\mathfrak{A}$ and suitable, by the non-appearance of the concept quotient,
for analogy in $\mathfrak{L}$, Bell uses for $\mathfrak{L}$ substantially the following: $\beta\,r\,\alpha$ is the
G. C. D. of all $\lambda$ such that $\alpha$ divides the arithmetic product of $\lambda$ and $\beta$.
If we use interpretation I, we have that $\beta\,r\,\alpha$ is the algebraic (in this case
also arithmetic) sum of all $\lambda$ such that $\alpha \mid \lambda\beta$. But for any such $\lambda$, $\lambda\beta\alpha = \lambda\beta$,


* Presented to the Society, April 7, 1928; received by the editors August 20, 1927. 

† E. T. Bell, Arithmetic of logic, these Transactions, vol. 29 (1927), pp. 597-611.

‡ H. M. Sheffer, A set of five independent postulates for Boolean algebras with applications to
logical constants, these Transactions, vol. 14 (1913), pp. 481-488. 


420 


BELL’S ARITHMETIC OF BOOLEAN ALGEBRA 421 


for which it is necessary and sufficient that $\lambda = \tau(\alpha + \beta')$, where $\tau$ is any
element of $\mathfrak{L}$. The sum of all such $\lambda$ is $\alpha + \beta'$.

If we adopt interpretation II, we find similarly that the residual of $\beta$
with respect to $\alpha$ is $\alpha\beta'$. We may thus add to the table of interpretations

(r)   Residual:               $\beta\,r\,\alpha$:   $\alpha + \beta'$,      $\alpha\beta'$.

For both interpretations we may write

$$\beta\,r\,\alpha = \alpha\,s\,\beta'.$$


3. Congruence. In rational number theory the assertion $a \equiv b \bmod m$
means that $a - b$ is divisible by $m$. If we desire to remain within the set
of non-negative rational integers, we may say that $a \equiv b \bmod m$ if there
exist $c$, $x$, $y$ such that $a = c + mx$, $b = c + my$. We adopt correspondingly as
the definition for $\mathfrak{L}$, in place of Bell's (1.1)-(1.7):

$\alpha \equiv \beta \bmod \mu$ if there exist $\gamma$, $\xi$, $\eta$ such that $\alpha = \gamma\,s\,(\mu\,p\,\xi)$, $\beta = \gamma\,s\,(\mu\,p\,\eta)$.

We shall see that this definition satisfies Bell's (1.1)-(1.4) and the Boolean
analogies of his (1.5), (1.6) just as do his own interpretations (4.1), (4.2),
gives even a better analogy for his (1.7), and preserves several other
important analogies with rational arithmetic which otherwise fail.

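Under interpretation I we have for $\alpha \equiv \beta \bmod \mu$

$$\alpha = \gamma + \mu\xi, \qquad \beta = \gamma + \mu\eta.$$

Multiplying (algebraically) by $\mu'$, we find $\alpha\mu' = \gamma\mu'$, $\beta\mu' = \gamma\mu'$; hence $\alpha\mu' = \beta\mu'$.
But conversely this condition is sufficient for $\alpha \equiv \beta \bmod \mu$; for if $\alpha\mu' = \beta\mu'$,
we may choose $\gamma = \alpha\mu' = \beta\mu'$, $\xi = \alpha$, $\eta = \beta$.

Under interpretation II we find similarly that for $\alpha \equiv \beta \bmod \mu$ it is
necessary and sufficient that $\alpha + \mu' = \beta + \mu'$. We therefore replace Bell's
interpretations by the following:

(c)   $\alpha \equiv \beta \bmod \mu$:   $\alpha\mu' = \beta\mu'$,   $\alpha + \mu' = \beta + \mu'$.

In both cases, $\alpha \equiv \beta \bmod \mu$ if and only if $\alpha\,p\,\mu' = \beta\,p\,\mu'$.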
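
The equivalence between the definition and the criterion under interpretation I can be checked exhaustively on a small model. The sketch below is only an illustration under stated assumptions: the Boolean algebra is modeled by the subsets of a three-element set stored as bit masks, the algebraic sum is `|`, the algebraic product is `&`, and the names `comp`, `congruent_by_definition` and `congruent_by_criterion` are hypothetical helpers, not notation from the paper.

```python
from itertools import product

N_ATOMS = 3                      # size of the model; an arbitrary small choice
FULL = (1 << N_ATOMS) - 1        # the universe (epsilon)
ELEMENTS = range(FULL + 1)       # all subsets, as bit masks

def comp(a):
    """Complement a' within the model."""
    return FULL & ~a

def congruent_by_definition(a, b, m):
    # There exist g, x, y with a = g + m*x and b = g + m*y (interpretation I).
    return any(a == (g | (m & x)) and b == (g | (m & y))
               for g, x, y in product(ELEMENTS, repeat=3))

def congruent_by_criterion(a, b, m):
    # a*m' = b*m'
    return (a & comp(m)) == (b & comp(m))

def criterion_matches_definition():
    return all(congruent_by_definition(a, b, m) == congruent_by_criterion(a, b, m)
               for a, b, m in product(ELEMENTS, repeat=3))

if __name__ == "__main__":
    print(criterion_matches_definition())
```

A brute-force check of this kind confirms the statement only for the chosen finite model; the general argument is the one given in the text.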

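4. Satisfaction of Bell's conditions. The first four of Bell's conditions
(1.1)-(1.4), stated directly for $\mathfrak{L}$ in terms of the congruence notation, are as
follows:

If $\alpha \equiv \beta \bmod \mu$, then $\beta \equiv \alpha \bmod \mu$.

If $\alpha \equiv \beta \bmod \mu$ and $\beta \equiv \gamma \bmod \mu$, then $\alpha \equiv \gamma \bmod \mu$.

If $\alpha \equiv \beta \bmod \mu$ and $\gamma \equiv \delta \bmod \mu$, then $\alpha\,s\,\gamma \equiv \beta\,s\,\delta \bmod \mu$.

If $\alpha \equiv \beta \bmod \mu$ and $\gamma \equiv \delta \bmod \mu$, then $\alpha\,p\,\gamma \equiv \beta\,p\,\delta \bmod \mu$.

That these hold under our criterion $\alpha\,p\,\mu' = \beta\,p\,\mu'$ is evident.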
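
For readers who prefer to see the four conditions tested rather than taken as evident, here is a small exhaustive check under both interpretations. It is a sketch under the same modeling assumption as before (subsets of a three-element set as bit masks); the dictionary `INTERPRETATIONS` and the function names are hypothetical.

```python
from itertools import product

FULL = 0b111                     # universe of a three-atom model (assumption)
ELEMENTS = range(FULL + 1)

def comp(a):
    return FULL & ~a

# Arithmetic sum s and arithmetic product p under the two dual interpretations.
INTERPRETATIONS = {
    "I":  {"s": lambda a, b: a | b, "p": lambda a, b: a & b},
    "II": {"s": lambda a, b: a & b, "p": lambda a, b: a | b},
}

def congruent(a, b, m, ops):
    # Criterion of the text: a p m' = b p m'.
    return ops["p"](a, comp(m)) == ops["p"](b, comp(m))

def conditions_hold(ops):
    for a, b, c, d, m in product(ELEMENTS, repeat=5):
        ab = congruent(a, b, m, ops)
        # Symmetry.
        if ab and not congruent(b, a, m, ops):
            return False
        # Transitivity (c plays the role of the third element).
        if ab and congruent(b, c, m, ops) and not congruent(a, c, m, ops):
            return False
        # Compatibility with arithmetic sum and product.
        if ab and congruent(c, d, m, ops):
            if not congruent(ops["s"](a, c), ops["s"](b, d), m, ops):
                return False
            if not congruent(ops["p"](a, c), ops["p"](b, d), m, ops):
                return False
    return True

if __name__ == "__main__":
    for name, ops in INTERPRETATIONS.items():
        print(name, conditions_hold(ops))
```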


422 W. A. HURWITZ 


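Bell's statement of (1.5) has as its Boolean analogue:
$\alpha \equiv \zeta \bmod \mu$ if and only if $\mu$ divides $\alpha$.
Testing in interpretation I, we have that $\alpha\mu' = \omega$ if and only if $\mu \mid \alpha$; that is,
$\alpha\mu' = \omega$ if and only if $\alpha\mu = \alpha$, which is true.

The Boolean analogue of Bell's (1.6) is the following:

If $\kappa\,p\,\alpha \equiv \kappa\,p\,\beta \bmod \mu$, then $\alpha \equiv \beta \bmod \kappa\,r\,\mu$.

Under interpretation I, the hypothesis is $(\kappa\alpha)\mu' = (\kappa\beta)\mu'$, and the conclusion
$\alpha(\mu + \kappa')' = \beta(\mu + \kappa')'$; but these statements are identical.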

Bell's last condition (1.7) is intended to furnish the analogue of the
following in rational arithmetic: $a \equiv a \bmod m$. Such an analogue holds,
under Bell's (4.1, 4.2), only in the special form (equivalent in the rational
case) $0 \equiv 0 \bmod m$. But obviously with the definition of the present paper,
we have the complete analogue

$$\alpha \equiv \alpha \bmod \mu.$$

In close relationship to this result lies the fact that under Bell's form
of congruence no two elements of a Boolean algebra can be congruent unless
each is congruent to the arithmetic zero. Indeed, we may compare the
generality of the two ideas by observing that while our definition makes
$\alpha \equiv \beta \bmod \mu$ if and only if $\alpha\,p\,\mu' = \beta\,p\,\mu'$, Bell's definition makes $\alpha \equiv \beta \bmod \mu$
if and only if $\alpha\,p\,\mu' = \beta\,p\,\mu' = \zeta$; the latter thus singles out one of the residue
classes into which we shall in the next section distribute all the elements of a
Boolean algebra.

5. Residue classes. In rational number theory (with positive and negative
integers) $a$ and $b$ belong to the same residue class with respect to $m$ if $a \equiv b
\bmod m$. The residue class of an element $a$ contains all elements $x$ which can
be written in the form $x = a + my$. If we restrict ourselves to non-negative
integers we may say that the residue class of $a$ consists of all $x$ such that
$x = a + my$ and all $x$ such that $a = x + my$. We may then naturally call an
element $a$ of a residue class the generator of the class if every member $x$ of the
class can be written in the form $x = a + my$. We shall say similarly for $\mathfrak{L}$:
the residue class of $\alpha$ consists of all $\xi$ such that $\xi = \alpha\,s\,(\mu\,p\,\eta)$ and all $\xi$ such that
$\alpha = \xi\,s\,(\mu\,p\,\eta)$; $\alpha$ is the generator of its residue class if and only if every member
$\xi$ of the class can be written in the form $\xi = \alpha\,s\,(\mu\,p\,\eta)$. The following theorems
are then analogues of theorems in the non-negative rational case:

Every residue class with respect to a modulus $\mu$ possesses one and only one
generator.

$\alpha \equiv \beta \bmod \mu$ if and only if $\alpha$ and $\beta$ belong to the same residue class with
respect to the modulus $\mu$.




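We shall confine the proofs to interpretation I. To prove the first theorem,
let $\alpha$ be any member of a residue class; then $\alpha_0 = \alpha\mu'$ is a generator. For if
either $\xi = \alpha + \mu\eta$ or $\alpha = \xi + \mu\eta$, then $\xi\mu' = \alpha\mu'$, and $\xi = \alpha_0 + \mu\xi$. There can not
be two distinct generators $\alpha_0$, $\alpha_1$; for if $\alpha_1 = \alpha_0 + \mu\eta_0$ and $\alpha_0 = \alpha_1 + \mu\eta_1$, it follows
that $\alpha_1 \mid \alpha_0$ and $\alpha_0 \mid \alpha_1$, so that $\alpha_0 = \alpha_1$.

The second theorem is obvious, since the generators of the residue classes
of $\alpha$ and $\beta$, which are respectively $\alpha\mu'$ and $\beta\mu'$, will be equal if and only if
$\alpha \equiv \beta \bmod \mu$.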
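
The existence and uniqueness of the generator can be illustrated on a small model. In the sketch below the residue class of an element is taken to be the set of all elements congruent to it, which by the second theorem is the class attached to its generator; this reading, the three-atom bit-mask model, and the helper names are assumptions made for the illustration only.

```python
FULL = 0b111                     # universe of a three-atom model (assumption)
ELEMENTS = range(FULL + 1)

def comp(a):
    return FULL & ~a

def residue_class(a, m):
    """All x with x*m' = a*m' (interpretation I)."""
    return [x for x in ELEMENTS if (x & comp(m)) == (a & comp(m))]

def is_generator(g, members, m):
    # g generates the class if every member x has the form g + m*eta.
    return all(any(x == (g | (m & eta)) for eta in ELEMENTS) for x in members)

def generators(a, m):
    members = residue_class(a, m)
    return [g for g in members if is_generator(g, members, m)]

def unique_generator_everywhere():
    # The only generator found should be a*m' for every a and m.
    return all(generators(a, m) == [a & comp(m)]
               for a in ELEMENTS for m in ELEMENTS)

if __name__ == "__main__":
    print(residue_class(0b101, 0b011), generators(0b101, 0b011))
```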

It is clear that the elements of a Boolean algebra which can participate
in Bell's definition of congruence are those belonging to the single residue
class whose generator is $\zeta$.

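6. Coprimality. Two elements $\alpha$, $\beta$ of a Boolean algebra are called
coprime (Bell, (22.1)) when $\alpha\,g\,\beta = \upsilon$. We may express this in terms of
congruence (without any precise analogue in rational arithmetic): $\alpha$ and $\beta$
are coprime if and only if $\alpha \equiv \upsilon \bmod \beta$. For the definition of coprimality,
$\alpha\,g\,\beta = \upsilon$, is the same as $\alpha\,s\,\beta = \upsilon$, which is equivalent to $\alpha\,p\,\beta' = \upsilon\,p\,\beta'$ or $\alpha \equiv \upsilon
\bmod \beta$.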

7. The linear congruence; arithmetic reciprocals. It is remarkable that
while algebraic division (i.e., solution of a linear equation) in $\mathfrak{L}$ is nearly
always impossible or non-unique, arithmetic division with respect to a
modulus is possible and unique under the same hypotheses as in rational
arithmetic.


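If $\alpha$, $\mu$ are coprime, there exists one and (congruentially) only one $\xi$ such that
$\alpha\,p\,\xi \equiv \beta \bmod \mu$.

For $\alpha \equiv \upsilon \bmod \mu$, by the preceding section, and $\alpha\,p\,\xi \equiv \xi \bmod \mu$; thus the given
congruence is equivalent to $\xi \equiv \beta \bmod \mu$.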
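
The coprime case can also be confirmed by enumeration. The following sketch, under interpretation I and with the same hypothetical three-atom bit-mask model as an assumption, checks that whenever $\alpha + \mu = \varepsilon$ the solutions of $\alpha\xi \equiv \beta \bmod \mu$ are exactly the elements congruent to $\beta$, so that a solution exists and is unique up to congruence.

```python
FULL = 0b111                     # universe of a three-atom model (assumption)
ELEMENTS = range(FULL + 1)

def comp(a):
    return FULL & ~a

def congruent(a, b, m):
    # Interpretation I criterion: a*m' = b*m'.
    return (a & comp(m)) == (b & comp(m))

def coprime(a, m):
    # a g m = unity, i.e. a + m is the universe under interpretation I.
    return (a | m) == FULL

def coprime_case_holds():
    for a in ELEMENTS:
        for m in ELEMENTS:
            if not coprime(a, m):
                continue
            for b in ELEMENTS:
                solutions = {x for x in ELEMENTS if congruent(a & x, b, m)}
                expected = {x for x in ELEMENTS if congruent(x, b, m)}
                if not solutions or solutions != expected:
                    return False
    return True

if __name__ == "__main__":
    print(coprime_case_holds())
```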
As a special case we have the following: 


If $\alpha$, $\mu$ are coprime, then $\alpha$ has with respect to the modulus $\mu$ one and
(congruentially) only one reciprocal.

The value of the reciprocal is given by $\xi \equiv \alpha \equiv \upsilon \bmod \mu$.
For the case of more general $\alpha$, $\mu$ we have the following theorem:


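The congruence $\alpha\,p\,\xi \equiv \beta \bmod \mu$ has no solution unless $\alpha\,g\,\mu$ divides $\beta$; if this
condition holds, and

$$\alpha\,g\,\mu = \delta, \qquad \alpha = \delta\,p\,\alpha_1, \qquad \mu = \delta\,p\,\mu_1, \qquad \beta = \delta\,p\,\beta_1,$$

then every solution is given by

$$\xi \equiv \xi_1\,s\,(\eta\,p\,\mu_1) \bmod \mu,$$

where $\xi_1$ is a properly selected element of the algebra satisfying the congruence
$\alpha_1\,p\,\xi_1 \equiv \beta_1 \bmod \mu_1$, and $\eta$ is arbitrary.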

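We give the proof under interpretation I. Let
(1) $\alpha\xi \equiv \beta \bmod \mu$,
(2) $\delta = \alpha + \mu$.
Then

$$\alpha\xi\mu' = \beta\mu', \qquad \beta = \alpha\xi\mu' + \beta\mu;$$

since $\delta \mid \alpha$ and $\delta \mid \mu$, it follows that $\delta \mid \beta$.

Now let
(3) $\alpha_1 = \alpha + \delta'$, $\beta_1 = \beta + \delta'$, $\mu_1 = \mu + \delta'$, so that $\alpha = \delta\alpha_1$, $\beta = \delta\beta_1$, $\mu = \delta\mu_1$.
The congruence $\alpha_1\xi_1 \equiv \beta_1 \bmod \mu_1$ has a solution; for

$$\alpha_1 + \mu_1 = (\alpha + \delta') + (\mu + \delta') = (\alpha + \mu) + \delta' = \delta + \delta' = \varepsilon,$$

so that $\alpha_1$, $\mu_1$ are coprime and $\alpha_1 \equiv \varepsilon \bmod \mu_1$. A solution is $\xi_1 = \beta$; for $\alpha_1\xi_1 = \alpha_1\beta
\equiv \varepsilon\beta = \varepsilon\delta\beta_1 = \delta\beta_1 \equiv \beta_1 \bmod \mu_1$. We shall show that

(4) $\xi \equiv \beta + \eta\mu_1 \bmod \mu$

is for every $\eta$ a solution of (1). By (2), (3), $\alpha \equiv \delta \bmod \mu$, and $\mu_1 \equiv \delta' \bmod \mu$;
hence

$$\alpha\xi \equiv \delta\xi \equiv \delta\beta + \eta\delta\mu_1 \bmod \mu,$$
$$\alpha\xi \equiv \beta + \eta\delta\delta' \bmod \mu,$$
$$\alpha\xi \equiv \beta \bmod \mu.$$

Conversely every solution of (1) is of the form (4) for some $\eta$. For from
(1) and (3) we deduce that $\delta\alpha_1\xi \equiv \delta\beta_1 \bmod \mu$, and by Bell, (1.6),

$$\alpha_1\xi \equiv \beta_1 \bmod \mu_1.$$

Hence $\xi \equiv \beta_1 \bmod \mu_1$, $\xi$ is with respect to $\mu_1$ in the residue class generated by
$\beta_1\mu_1'$, and $\xi = \beta_1\mu_1' + \eta\mu_1$. But $\beta_1\mu_1' = \beta_1\delta\mu' = \beta\mu'$, $\beta_1\mu_1'$ is with respect to $\mu$
in the residue class generated by $\beta\mu'$, and $\beta_1\mu_1' \equiv \beta \bmod \mu$. Thus (4) must hold.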
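
The general theorem lends itself to the same kind of exhaustive check. The sketch below works under interpretation I and rests on several assumptions that are ours rather than the paper's: the three-atom bit-mask model, the reading of the solvability condition as "$\beta$ is contained in $\delta = \alpha + \mu$", the choice $\mu_1 = \mu + \delta'$, and the solution family in the form $\xi \equiv \beta + \eta\mu_1 \bmod \mu$. All function names are hypothetical.

```python
FULL = 0b111                     # universe of a three-atom model (assumption)
ELEMENTS = range(FULL + 1)

def comp(a):
    return FULL & ~a

def congruent(a, b, m):
    # Interpretation I criterion: a*m' = b*m'.
    return (a & comp(m)) == (b & comp(m))

def solution_set(a, b, m):
    """All xi with a*xi congruent to b modulo m, found by enumeration."""
    return {x for x in ELEMENTS if congruent(a & x, b, m)}

def claimed_family(b, m, m1):
    """All xi congruent (mod m) to b + eta*m1 for some eta."""
    return {x for x in ELEMENTS
            if any(congruent(x, b | (eta & m1), m) for eta in ELEMENTS)}

def theorem_holds():
    for a in ELEMENTS:
        for b in ELEMENTS:
            for m in ELEMENTS:
                delta = a | m                       # delta = a g m
                sols = solution_set(a, b, m)
                divides = (b & comp(delta)) == 0    # delta divides b
                if bool(sols) != divides:
                    return False
                if divides:
                    m1 = m | comp(delta)            # assumed mu_1 = mu + delta'
                    if sols != claimed_family(b, m, m1):
                        return False
    return True

if __name__ == "__main__":
    print(theorem_holds())
```

If the solution family is instead built from $\beta_1 = \beta + \delta'$ in place of $\beta$, the enumeration misses some solutions, which is consistent with the theorem's requirement that $\xi_1$ be "properly selected".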


CORNELL UNIVERSITY,
ITHACA, N. Y.




A THEOREM ON ORTHOGONAL FUNCTIONS WITH AN 
APPLICATION TO INTEGRAL INEQUALITIES* 


BY 
LLOYD L. DINES 


It is well known that for a given finite set of functions

$$\{f_i\}: \qquad f_1(x),\ f_2(x),\ \dots,\ f_m(x),$$

continuous on an interval

$$(\mathfrak{X}) \qquad a \le x \le b,$$

a continuous function $f(x)$ can be determined which is orthogonal to all
functions of the set, that is, the conditions

$$\int_a^b f(x) f_i(x)\,dx = 0 \qquad (i = 1, 2, \dots, m)$$

can be satisfied. The principal object of the present paper is to determine
under what conditions such a function $f(x)$ can be everywhere positive.
This object is attained in the following


THEOREM I. A necessary and sufficient condition that a set of functions, 
continuous and linearly independent on a closed interval, admit a positive 
continuous function orthogonal to all of them is that every linear combination 
of the functions change sign on the interval.†


In the last section of the paper an application of this theorem is given 
in a study of the integral inequality 


(1) $\phi(x) + \int_a^b \kappa(x,s)\,\phi(s)\,ds > 0.$


The principal result in this connection is the following 


* Presented to the Society, San Francisco Section, under different title, June 12, 1926; received 
by the editors in November, 1926. 

† This theorem is an analogue of an algebraic theorem given in a recent paper, Note on certain
associated systems of linear equalities and inequalities, Annals of Mathematics, (2), vol. 28 (1926-27), 
pp. 41-42. 




426 L. L. DINES [April 


THEOREM II. A necessary and sufficient condition that the integral inequality
(1) admit a solution $\phi(x)$ is that every non-trivial solution $\psi(x)$ of the associated
integral equation

$$\psi(x) + \int_a^b \psi(s)\,\kappa(s, x)\,ds = 0$$

shall change sign.*


The first part of the paper is devoted to preliminary notions which are 
used in the proof of Theorem I. 

1. An outline of the proof of Theorem I. The necessity of the condition
in Theorem I is almost obvious. For if $f(x)$ is orthogonal to each of the
functions $f_i(x)$, it is orthogonal to every linear combination of them, that is

$$\int_a^b f(x) \sum_{j=1}^m a_j f_j(x)\,dx = 0;$$

and this relation for a positive $f(x)$ demands that the second factor of the
integrand change sign unless it is identically zero. Hence since the functions
are linearly independent, so that no non-trivial linear combination of them is
identically zero, the condition of the theorem is necessary.
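
A concrete instance may help fix the statement of Theorem I. The sketch below is an illustration only; the interval $(-1, 1)$, the pair $f_1 = x$, $f_2 = x^2 - 1/3$, and the positive function $f \equiv 1$ are assumptions chosen for the example and do not come from the paper. It checks numerically that $f$ is orthogonal to both functions and that a sample of linear combinations all change sign, as the necessity argument requires.

```python
import math

A, B = -1.0, 1.0                 # hypothetical interval for the example

def f1(x):
    return x

def f2(x):
    return x * x - 1.0 / 3.0

def positive_f(x):
    return 1.0                   # candidate positive orthogonal function

def integrate(g, n=2000):
    """Midpoint rule on (A, B)."""
    h = (B - A) / n
    return h * sum(g(A + (i + 0.5) * h) for i in range(n))

def is_orthogonal(f, g, tol=1e-5):
    return abs(integrate(lambda x: f(x) * g(x))) < tol

def changes_sign(g, n=2001):
    values = [g(A + (B - A) * i / (n - 1)) for i in range(n)]
    return min(values) < 0.0 < max(values)

def all_sampled_combinations_change_sign(steps=360):
    # Coefficients (a1, a2) run over sampled directions on the unit circle.
    for k in range(steps):
        theta = 2 * math.pi * k / steps
        a1, a2 = math.cos(theta), math.sin(theta)
        if not changes_sign(lambda x: a1 * f1(x) + a2 * f2(x)):
            return False
    return True

if __name__ == "__main__":
    print(is_orthogonal(positive_f, f1), is_orthogonal(positive_f, f2))
    print(all_sampled_combinations_change_sign())
```

By contrast, a set containing a function of one sign, such as $x^2$, admits no positive orthogonal function, since the integral of a positive function times $x^2$ cannot vanish.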

To prove the sufficiency of the condition, we proceed as follows. By a 
certain well defined operation, the given set of m functions {f;} is replaced 
by another set of m functions called a reduced set.† The efficacy of the
reduction is due to the following two properties: 

(a) One function of the reduced set is identically zero. 

(b) If the reduced set admits a positive function orthogonal to all of its 
members, the same is true of the original set. 

The reduction process may be repeated, the property (b) persisting, and 
the property (a) introducing a new zero function at each repetition; so that 
after m—1 reductions the resulting set contains only one function which 
is different from zero. It is not difficult to show that this function admits a 
positive orthogonal function, whence from (b) it follows that the given set 
admits a positive orthogonal function. 

A complication arises from the fact that the range of the argument of a 
“reduced” set of functions is not the same as the range of the argument of 
the original set. The functions of the reduced set are as a matter of fact 
functions of a variable whose range is a composite range. 


* For an analogous algebraic theorem, see papers by Carver, Annals of Mathematics, (2), vol. 
23, p. 212; and the author, ibid., vol. 27, p. 57. 

† A reduction process for a set of functions on a general range has been described by the author
in an earlier paper, On sets of functions of a general variable, these Transactions, vol. 29 (1927),
pp. 463-470.


1928] A THEOREM ON ORTHOGONAL FUNCTIONS 427 
2. Reduction and composition of a range relative to a function on it. 

Let us consider any continuous function $\rho(x)$ on the range

$$(\mathfrak{X}) \qquad a \le x \le b,$$

which changes sign on the range. Relative to the function $\rho(x)$, we may
determine two subclasses of the range $\mathfrak{X}$, defined as follows:

$$\mathfrak{X}_P^{(\rho)} = [\text{all } x \text{ such that } \rho(x) \ge 0],$$
$$\mathfrak{X}_N^{(\rho)} = [\text{all } x \text{ such that } \rho(x) < 0].$$

For convenience we denote the elements of these subclasses by $p^{(\rho)}$ and $n^{(\rho)}$
respectively; thus

(2) $\mathfrak{X}_P^{(\rho)} = [p^{(\rho)}], \qquad \mathfrak{X}_N^{(\rho)} = [n^{(\rho)}].$

From the properties of continuous functions we may draw the following
conclusions with reference to these subclasses. The subclass $\mathfrak{X}_N^{(\rho)}$ consists
(geometrically speaking) of linear intervals open at both ends except when
terminated by an end point $a$ or $b$ of the fundamental interval $\mathfrak{X}$. The
subclass $\mathfrak{X}_P^{(\rho)}$ is a closed point set. If the reducing function $\rho(x)$ is of simple
type (for example if it changes sign at each point at which it vanishes),
$\mathfrak{X}_P^{(\rho)}$ consists of closed linear intervals; and it will in fact always contain at
least one such interval.*
From the two subclasses (2) we now form a composite range

$$\mathfrak{X}^{(\rho)} = \mathfrak{X}_P^{(\rho)}\,\mathfrak{X}_N^{(\rho)},$$

the elements of which are bipartite, of the form $p^{(\rho)}n^{(\rho)}$. The new range

$$\mathfrak{X}^{(\rho)} = [x^{(\rho)}] = [p^{(\rho)}n^{(\rho)}]$$

is a two-dimensional point set contained in the square $\mathfrak{X}\mathfrak{X}$. If $\rho(x)$ is of a
simple type, the points constitute a set of rectangles forming a sort of
irregular checker-board arrangement. And for any $\rho(x)$, the range will
include at least one such rectangle.

A function on the range $\mathfrak{X}^{(\rho)}$ will be said to be continuous if it is a
continuous function of the two variables $p^{(\rho)}$, $n^{(\rho)}$. A function will be said to

* But since $\mathfrak{X}_P^{(\rho)}$ includes the zeros of $\rho(x)$, it may contain isolated points, and even perfect sets
of points which comprise no interval, as in the following example. Let $\mathfrak{X}$ be the closed interval
(0, 2). On the left half of this interval form the Cantor perfect set by the removal of open middle thirds
(see Pierpont's Theory of Functions of Real Variables, vol. 1, § 272). On this perfect set let $\rho(x) = 0$,
and on its complement with respect to (0, 1) let $\rho(x)$ be negative. For $x > 1$ let $\rho(x)$ be positive. This
interesting example was suggested to me by Professor Kellogg, to whom I am indebted for a number 
of valuable criticisms and suggestions. 




change sign internally on the range $\mathfrak{X}^{(\rho)}$ if it is positive at some inner points
and negative at other inner points of the range. Analogous definitions of
continuity and internal change of sign are to be understood relative to all
the composite ranges occurring in what follows.

Suppose now that on the new composite range $\mathfrak{X}^{(\rho)}$ there is defined in any
way a real, single-valued, continuous function $\sigma$, which changes sign
internally on the range. Then this function determines two subclasses of the
range

$$\mathfrak{X}_P^{(\rho\sigma)} = [\text{all } x^{(\rho)} \text{ for which } \sigma \ge 0],$$
$$\mathfrak{X}_N^{(\rho\sigma)} = [\text{all } x^{(\rho)} \text{ for which } \sigma < 0];$$

and from these subclasses we may form a composite class

$$\mathfrak{X}^{(\rho\sigma)} = \mathfrak{X}_P^{(\rho\sigma)}\,\mathfrak{X}_N^{(\rho\sigma)}.$$

The elements of this class are quadripartite. Geometrically they form a
point set in the four-dimensional hypercube $\mathfrak{X}\mathfrak{X}\mathfrak{X}\mathfrak{X}$, of which some subset
at least are inner points.

The process of reduction and composition can be repeated indefinitely,
provided at each stage a reducing function is available. To generalize the
procedure and notation, suppose reduction with respect to the successive
reducing functions $\rho_1, \rho_2, \dots, \rho_{k-1}$ has yielded the composite range

$$\mathfrak{X}^{(\rho_1\rho_2\cdots\rho_{k-1})}$$

consisting of $2^{k-1}$-partite elements. Suppose further that $\rho_k$ is a
single-valued continuous function changing sign internally on this range. By
reduction with respect to $\rho_k$ and composition we obtain the new range

$$\mathfrak{X}^{(\rho_1\rho_2\cdots\rho_k)} = \mathfrak{X}_P^{(\rho_1\rho_2\cdots\rho_k)}\,\mathfrak{X}_N^{(\rho_1\rho_2\cdots\rho_k)},$$

the elements of which are $2^k$-partite. Geometrically the new range is a point
set in the $2^k$-dimensional cube $\mathfrak{X}\mathfrak{X}\cdots\mathfrak{X}$, of which some points at least are
inner points.

3. Reduced outer multiplication. Consider again the original range $\mathfrak{X}$
and a continuous reducing function $\rho(x)$ changing sign on $\mathfrak{X}$. This function
determines with any second continuous function $f(x)$ a function on the
composite range $\mathfrak{X}^{(\rho)}$, which we shall call their reduced outer product and
denote by $((\rho f))$. Its functional values are defined by the formula

$$((\rho f))(p, n) = \rho(p)f(n) - f(p)\rho(n), \qquad (p, n) \text{ on } \mathfrak{X}^{(\rho)}.$$




This multiplication is clearly not commutative. Its most obvious property
is that $((\rho\rho)) = 0$. Another property easily verified is that $((\rho f))$ is
continuous on $\mathfrak{X}^{(\rho)}$.
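
The reduced outer product is simple enough to write down directly. The sketch below implements the formula $((\rho f))(p, n) = \rho(p)f(n) - f(p)\rho(n)$ for ordinary real functions; the sample functions $\rho(x) = x$ and $f(x) = x^2$ on $(-1, 1)$ and the name `reduced_outer_product` are assumptions made for illustration.

```python
def reduced_outer_product(rho, f):
    """Return the function (p, n) -> rho(p)*f(n) - f(p)*rho(n).

    The arguments p and n are meant to be taken with rho(p) >= 0 and
    rho(n) < 0; the function itself does not enforce this.
    """
    def product(p, n):
        return rho(p) * f(n) - f(p) * rho(n)
    return product

# Hypothetical sample functions on the interval (-1, 1).
def rho(x):
    return x

def f(x):
    return x * x

rho_f = reduced_outer_product(rho, f)
f_rho = reduced_outer_product(f, rho)
rho_rho = reduced_outer_product(rho, rho)

if __name__ == "__main__":
    p, n = 0.5, -0.5             # rho(p) >= 0 and rho(n) < 0
    print(rho_f(p, n), f_rho(p, n), rho_rho(p, n))
```

With these sample values the product of $\rho$ with itself vanishes and interchanging the two factors reverses the sign, which illustrates both remarks of the text.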

Reduced outer multiplication is defined in a similar way upon any of the
composite ranges described in the preceding section. Consider for example
the reduced composite range

$$\mathfrak{X}^{(\rho_1\rho_2\cdots\rho_{k-1})},$$

and suppose that $\rho_k$ is a continuous function changing sign on this range.
Then $\rho_k$ determines with any second continuous function $f_k$ on the range
a reduced outer product $((\rho_k f_k))$, given by the formula

$$((\rho_k f_k))(p, n) = \rho_k(p)f_k(n) - f_k(p)\rho_k(n), \qquad (p, n) \text{ on } \mathfrak{X}^{(\rho_1\rho_2\cdots\rho_k)}.$$

This product is continuous at all points of its region of definition.
4. Reduction of a set of functions. Consider the set of functions

$$\{f_i\}: \qquad f_1(x),\ f_2(x),\ \dots,\ f_m(x),$$

continuous on the range $\mathfrak{X}$, and suppose that $f_1(x)$ changes sign.
Relative to $f_1(x)$ we form a new set of $m$ functions

$$\{f_i^{(1)}\}: \qquad ((f_1f_1)),\ ((f_1f_2)),\ \dots,\ ((f_1f_m)),$$

each of which is the reduced outer product of the corresponding function
of the given set by $f_1$. All functions of this reduced set are on the composite
range $\mathfrak{X}^{(f_1)}$, written $\mathfrak{X}^{(1)}$ for brevity, and are continuous on that range. It is to be
noted particularly that the first function $((f_1f_1))$ is identically zero. (See property (a) of §1.)

The reduction process may now be repeated, the second function $f_2^{(1)}$
$(= ((f_1f_2)))$ being used as a reducing function (assuming that it changes
sign), and a second reduced set of functions obtained, having the property
that its first two functions are identically zero. Its range $\mathfrak{X}^{(f_1 f_2^{(1)})}$ will be
denoted for brevity by $\mathfrak{X}^{(12)}$.

To make the reduction procedure and notation general, let $\{f_i^{(12\cdots k-1)}\}$
denote the set of functions obtained by $k-1$ successive reductions of the
kind indicated. Its first $k-1$ functions are identically zero, and its $k$th
function is $f_k^{(12\cdots k-1)}$. If this function changes sign we may use it as a
reducing function and form the new set $\{f_i^{(12\cdots k)}\}$, each function of which
is the reduced outer product of the corresponding function of $\{f_i^{(12\cdots k-1)}\}$
by $f_k^{(12\cdots k-1)}$. The functions of this new set are defined and continuous on
the range $\mathfrak{X}^{(12\cdots k)}$, and the first $k$ of them are identically zero.




The reduction procedure thus defined will after m—1 operations yield 
a set of functions all of which except the last are identically zero. The as- 
sumption which we have made that at each stage the reducing function 
changes sign will now be justified. 


Lemma I. If every linear combination

$$a_1f_1(x) + a_2f_2(x) + \cdots + a_mf_m(x)$$

of the set of functions $\{f_i(x)\}$ changes sign on $\mathfrak{X}$, then every linear combination

$$a_2f_2^{(1)} + a_3f_3^{(1)} + \cdots + a_mf_m^{(1)}$$

of the last $m-1$ functions of the reduced set $\{f_i^{(1)}\}$ changes sign internally on its
range $\mathfrak{X}^{(1)}$.


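To prove the proposition indirectly, suppose that there is a set of
constant multipliers $a_2, a_3, \dots, a_m$, such that

(3) $\sum_{j=2}^{m} a_j f_j^{(1)} \ge 0$ on $\mathfrak{X}^{(1)}$.

We first recall that the range $\mathfrak{X}^{(1)}$ is a composite range, $\mathfrak{X}^{(1)} = \mathfrak{X}_P^{(1)}\mathfrak{X}_N^{(1)}$,
the first component class $\mathfrak{X}_P^{(1)}$ consisting of those elements of $\mathfrak{X}$ for which $f_1$
is positive or zero. For our present purpose, it is convenient to divide the
class $\mathfrak{X}_P^{(1)}$ into two subclasses:

$$\mathfrak{X}_P^{(1)} = \mathfrak{X}_{P'}^{(1)} + \mathfrak{X}_Z^{(1)},$$

where

$$\mathfrak{X}_{P'}^{(1)} = [\text{all } x \text{ for which } f_1 > 0],$$
$$\mathfrak{X}_Z^{(1)} = [\text{all } x \text{ for which } f_1 = 0].$$

Next, recalling the definition of the reduced function $f_j^{(1)}$, we may replace
our supposition (3) by the two statements

(4) $\sum_{j=2}^{m} a_j\,[f_1(p')f_j(n) - f_j(p')f_1(n)] \ge 0$, $(p', n)$ on $\mathfrak{X}_{P'}^{(1)}\mathfrak{X}_N^{(1)}$,

(5) $-f_1(n)\sum_{j=2}^{m} a_j f_j(z) \ge 0$, $(z, n)$ on $\mathfrak{X}_Z^{(1)}\mathfrak{X}_N^{(1)}$.

Now since $-f_1(p')f_1(n)$ is certainly positive, we may obtain from (4) the
equivalent statement

(6) $\sum_{j=2}^{m} a_j \dfrac{f_j(p')}{f_1(p')} \ge \sum_{j=2}^{m} a_j \dfrac{f_j(n)}{f_1(n)}$, $p'$ on $\mathfrak{X}_{P'}^{(1)}$, $n$ on $\mathfrak{X}_N^{(1)}$.

The various values of the expression on the left side of (6) must have a
greatest lower bound, and those of the expression on the right must have a
least upper bound, which bounds may or may not coincide.
In any case we may choose a constant $a_1$ such that

$$\sum_{j=2}^{m} a_j \frac{f_j(p')}{f_1(p')} \ge -a_1 \ge \sum_{j=2}^{m} a_j \frac{f_j(n)}{f_1(n)}, \qquad p' \text{ on } \mathfrak{X}_{P'}^{(1)},\ n \text{ on } \mathfrak{X}_N^{(1)}.$$

And from this double relation we obtain

(7) $\sum_{j=1}^{m} a_j f_j(p') \ge 0$, $\sum_{j=1}^{m} a_j f_j(n) \ge 0$, $p'$ on $\mathfrak{X}_{P'}^{(1)}$, $n$ on $\mathfrak{X}_N^{(1)}$.

Furthermore, division of (5) by $-f_1(n)$, which is certainly positive, yields
a statement which may be written

(8) $\sum_{j=1}^{m} a_j f_j(z) \ge 0$, $z$ on $\mathfrak{X}_Z^{(1)}$.

But the statements (7) and (8) can be combined to give

$$\sum_{j=1}^{m} a_j f_j(x) \ge 0, \qquad x \text{ on } \mathfrak{X}.$$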

This contradicts the hypothesis of the lemma, and the contradiction proves
that every linear combination $\sum_{j=2}^{m} a_j f_j^{(1)}$ must be negative somewhere on
$\mathfrak{X}^{(1)}$. To see that it must be negative at an inner point, we note first that all
points of $\mathfrak{X}_{P'}^{(1)}\mathfrak{X}_N^{(1)}$ are inner points. If it is negative at none of these points,
then it must be negative on $\mathfrak{X}_Z^{(1)}\mathfrak{X}_N^{(1)}$, that is, the left side of (5) must be
somewhere negative, and hence the left side of (8) must be negative at some
point of $\mathfrak{X}_Z^{(1)}$. If the left side of (8) is negative at an inner point of $\mathfrak{X}_Z^{(1)}$,
then the left side of (5) is negative at an inner point of $\mathfrak{X}_Z^{(1)}\mathfrak{X}_N^{(1)}$ which is
a fortiori an inner point of $\mathfrak{X}^{(1)}$, and our desired conclusion is reached. If on
the other hand the left side of (8) is negative only at frontier points of $\mathfrak{X}_Z^{(1)}$,
it follows from continuity (since each such frontier point is a limit point of
$\mathfrak{X}_{P'}^{(1)}$ or $\mathfrak{X}_N^{(1)}$) that one of the inequalities in (7) is contradicted. But this
involves a contradiction of (4), which contradiction means explicitly that
$\sum_{j=2}^{m} a_j f_j^{(1)}$ is negative on $\mathfrak{X}_{P'}^{(1)}\mathfrak{X}_N^{(1)}$, hence at an inner point of $\mathfrak{X}^{(1)}$. In an
entirely analogous way it may be shown that every such linear combination
must be positive at an inner point of $\mathfrak{X}^{(1)}$. The proof of the lemma is then
complete.


An argument similar to the one we have just made suffices to prove 
the following 


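Lemma II. If every linear combination

$$a_k f_k^{(12\cdots k-1)} + a_{k+1} f_{k+1}^{(12\cdots k-1)} + \cdots + a_m f_m^{(12\cdots k-1)}$$

of the last $m-k+1$ functions of the reduced set $\{f_i^{(12\cdots k-1)}\}$ changes sign
on its range $\mathfrak{X}^{(12\cdots k-1)}$, then every linear combination

$$a_{k+1} f_{k+1}^{(12\cdots k)} + a_{k+2} f_{k+2}^{(12\cdots k)} + \cdots + a_m f_m^{(12\cdots k)}$$

of the last $m-k$ functions of the reduced set $\{f_i^{(12\cdots k)}\}$ changes sign internally
on its range $\mathfrak{X}^{(12\cdots k)}$.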

Successive application of these lemmas now justifies the assumption 
which we made in the early part of this section: 

If every linear combination of the given set of functions changes sign on $\mathfrak{X}$,
then every function appearing as a reducing function in the progressive reduction
process described in this section changes sign internally on its range.


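5. Proof of property (b) of §1. Two functions $f$ and $g$ defined upon one
of the reduced composite ranges $\mathfrak{X}^{(12\cdots k)}$, will be said to be orthogonal one
to the other if*

$$\int_{\mathfrak{X}^{(12\cdots k)}} f\,g = 0.$$

Suppose now that $k-1$ reductions of the given set of functions

$$f_1,\ f_2,\ \dots,\ f_m$$

have yielded the set

$$\{f_i^{(12\cdots k-1)}\}$$

on the range $\mathfrak{X}^{(12\cdots k-1)}$, and the reduction of this latter set with respect to
$f_k^{(12\cdots k-1)}$ has yielded the set

$$\{f_i^{(12\cdots k)}\}$$

on the range $\mathfrak{X}^{(12\cdots k)}$.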

* In this section and in those which follow, the integrals may be taken in the sense of Lebesgue 
whenever there is doubt as to their existence in the sense of Riemann. Of their existence in the 
former sense there will be no doubt. 




Suppose further that relative to this last set $\{f_i^{(12\cdots k)}\}$ there is a function
$\Pi^{(12\cdots k)}$, everywhere positive and continuous on the range $\mathfrak{X}^{(12\cdots k)}$, and
orthogonal to all functions of the set.

Then there is a function $\Pi^{(12\cdots k-1)}$, everywhere positive on the preceding
range $\mathfrak{X}^{(12\cdots k-1)}$, which is orthogonal to all functions of the preceding set
$\{f_i^{(12\cdots k-1)}\}$.

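To prove this proposition we start with the hypothesis that the positive
function $\Pi^{(12\cdots k)}$ satisfies the conditions


(9) $\int_{\mathfrak{X}^{(12\cdots k)}} \Pi^{(12\cdots k)} f_j^{(12\cdots k)} = 0 \qquad (j = 1, 2, \dots, m).$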


Since the defining formula for is 
(p, nm) on Xp? + -- 


we obtain, by substitution in (9) and decomposition of multiple integrals, 
the equalities 


xy 


P N 


=O = 1,2, m). 


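These may be written in the form


$\int_{\mathfrak{X}^{(12\cdots k-1)}} \Pi^{(12\cdots k-1)} f_j^{(12\cdots k-1)} = 0 \qquad (j = 1, 2, \dots, m),$

the function $\Pi^{(12\cdots k-1)}$ being defined on the range $\mathfrak{X}^{(12\cdots k-1)}$ as follows: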


P 


N 


\ 


[J (12+ _| 




It is obvious from this definition that the function $\Pi^{(12\cdots k-1)}$ is positive
on the range $\mathfrak{X}^{(12\cdots k-1)}$. It is likewise continuous at all points except possibly
at those points of $\mathfrak{X}_P^{(12\cdots k-1)}$ which are limit points of $\mathfrak{X}_N^{(12\cdots k-1)}$.

In the next section we make an alteration in the procedure which will 
insure continuity at these points also. 

6. An alteration to secure continuity. In the preceding section the function
$\Pi^{(12\cdots k)}$ is assumed to possess certain properties, and upon these
assumptions the existence and certain properties of the function $\Pi^{(12\cdots k-1)}$
are established. The function $\Pi^{(12\cdots k)}$ of the hypothesis is still unnecessarily
general for our purpose. By making certain additional assumptions with
regard to it we may expect to obtain additional properties for the function
$\Pi^{(12\cdots k-1)}$. The additional property desired is continuity. To this end,
we now assume that $\Pi^{(12\cdots k)}$ is equal to unity at all points of the range
$\mathfrak{X}^{(12\cdots k)}$ except on some closed aggregates of inner points.

Now if $p_0$ is a point of $\mathfrak{X}_P^{(12\cdots k-1)}$ which is a limit point of $\mathfrak{X}_N^{(12\cdots k-1)}$,
then $(p_0, n)$ is, for every $n$, a frontier point of $\mathfrak{X}^{(12\cdots k)}$, and from the
assumption just made it follows that $\Pi^{(12\cdots k)}(p_0, n) = 1$. Hence from the
definition of $\Pi^{(12\cdots k-1)}$


N 


Furthermore, from the assumption it follows that for all points $n$ of $\mathfrak{X}_N^{(12\cdots k-1)}$
sufficiently near to $p_0$, $\Pi^{(12\cdots k)}(p, n) = 1$, and hence for such values of $n$


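Hence we secure continuity of $\Pi^{(12\cdots k-1)}$ at all such points $p_0$ on the range
$\mathfrak{X}^{(12\cdots k-1)}$ if


(10) $\int_{\mathfrak{X}_P^{(12\cdots k-1)}} f_k^{(12\cdots k-1)} = -\int_{\mathfrak{X}_N^{(12\cdots k-1)}} f_k^{(12\cdots k-1)} = 1.$
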

We secure this property (10) by prefixing a slight preparatory operation to 
the reduction process previously described. 
In the set

$\{f_i^{(12\cdots k-1)}\}$

the function $f_k^{(12\cdots k-1)}$, specified as the next reducing function, may or
may not satisfy the equality


$\int_{\mathfrak{X}^{(12\cdots k-1)}} f_k^{(12\cdots k-1)} = 0.$


If this condition is satisfied, then


$\int_{\mathfrak{X}_P^{(12\cdots k-1)}} f_k^{(12\cdots k-1)} = -\int_{\mathfrak{X}_N^{(12\cdots k-1)}} f_k^{(12\cdots k-1)} = \text{a positive constant},$


and division of $f_k^{(12\cdots k-1)}$ by this constant gives a new function by which
it can be replaced, and the new function will have the property indicated by
(10).

If the condition (10) is not satisfied by a constant times $f_k^{(12\cdots k-1)}$ but
the analogous condition is satisfied by a constant times some succeeding
function in the sequence $\{f_i^{(12\cdots k-1)}\}$, the two functions may be
interchanged and the desired property secured.

Suppose then that no one of the functions $f_j^{(12\cdots k-1)}$ $(j \geq k)$ satisfies
the condition indicated by (10). Then we replace $f_k^{(12\cdots k-1)}$ by a linear
combination


k af, + 


where the constants $\alpha$ and $\beta$ are so chosen that the condition analogous to
(10) is satisfied by the resulting function. The replacement of $f_k^{(12\cdots k-1)}$ by
this function will not affect any of the vital properties of our reduction
process, and its use secures the desired continuity of the function $\Pi^{(12\cdots k-1)}$.

Furthermore, a consideration of the definition of $\Pi^{(12\cdots k-1)}$ in the preceding
section, together with the additional hypothesis on $\Pi^{(12\cdots k)}$ in the present
section and the condition (10), shows that the function $\Pi^{(12\cdots k-1)}$ is equal
to unity at all points of its range $\mathfrak{X}^{(12\cdots k-1)}$ except some closed sub-sets
composed entirely of inner points of the range. Hence we have the following
amplification of the property (b) as stated in §1:


If all the functions of any reduced set admit a common orthogonal function 
which is positive, continuous, and equal to unity except on some closed sub-sets 
of inner points, then the set of functions from which it is obtained by reduction 
has the same property. 


7. Conclusion of the proof of Theorem I. Starting with the given set
of functions


$\{f_i\}$: $f_1(x), f_2(x), \dots, f_m(x),$


we obtain, after $m-1$ reductions of the type described in the preceding
sections, a set


$0, 0, \dots, 0, f_m^{(12\cdots m-1)}$


in which all functions except the last one, $f_m^{(12\cdots m-1)}$, are identically zero.
If this last set admits a positive orthogonal function, continuous, and equal
to unity except on closed sub-sets of inner points, then by §6 the given set
admits a positive and continuous orthogonal function, and the proof of our
theorem is complete.

Our problem is then reduced to showing that the set $\{f_i^{(12\cdots m-1)}\}$
admits an orthogonal function of the kind described, or more simply still,
that the single function $f_m^{(12\cdots m-1)}$ admits such an orthogonal function.
This is not difficult.

From the hypothesis that every linear combination of the given functions
changes sign, it follows by §4 that $f_m^{(12\cdots m-1)}$ changes sign internally.
Therefore from the continuity preserved in the reduction process it follows
that there is a set of inner points completely bounded by inner points on
which the function is positive, and a similar set of points on which it is
negative. We denote, for the moment, the former set by $\mathfrak{P}$ and the latter
set by $\mathfrak{N}$, and consider two functions $\Pi_P$ and $\Pi_N$ on the range $\mathfrak{X}^{(12\cdots m-1)}$,
restricted by the following conditions. Both functions are continuous. The
former $\Pi_P$ is positive on the region $\mathfrak{P}$ (excluding the boundary) and is
elsewhere zero. The latter $\Pi_N$ is positive on the region $\mathfrak{N}$ (excluding the
boundary) and is elsewhere zero.

Next we consider the function


(11) $\Pi^{(12\cdots m-1)} = 1 + \alpha\,\Pi_P + \beta\,\Pi_N,$


$\alpha$ and $\beta$ being constants to be determined. This function is evidently
continuous on $\mathfrak{X}^{(12\cdots m-1)}$, and is positive if $\alpha$ and $\beta$ are positive. Furthermore
it is equal to unity except at points of $\mathfrak{P}$ and $\mathfrak{N}$.

We now choose $\alpha$ and $\beta$ so that


$\int_{\mathfrak{X}^{(12\cdots m-1)}} \Pi^{(12\cdots m-1)} f_m^{(12\cdots m-1)} = 0,$


which is equivalent to


$\int f_m^{(12\cdots m-1)} + \alpha \int \Pi_P\, f_m^{(12\cdots m-1)} + \beta \int \Pi_N\, f_m^{(12\cdots m-1)} = 0.$


Since the second integral is positive and the third integral is negative,
$\alpha$ and $\beta$ can be given positive values which will satisfy this condition. The
function $\Pi^{(12\cdots m-1)}$ defined by (11) is then positive, continuous on
$\mathfrak{X}^{(12\cdots m-1)}$, and is equal to unity except at points of $\mathfrak{P}$ and $\mathfrak{N}$. It is
orthogonal to $f_m^{(12\cdots m-1)}$ and hence to all functions of the set $\{f_i^{(12\cdots m-1)}\}$.
From the existence of such a function there follows, as we have seen, the
existence of a positive continuous function orthogonal to all functions of
the given set $\{f_i\}$ and the proof of Theorem I is complete.
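
The choice of the two constants in the last step can be illustrated numerically. The following is only a sketch under stated assumptions: the integrals are replaced by plain sums over sample values, and the function name `balance_weights` and its arguments are hypothetical names introduced for this illustration, not notation from the paper.

```python
def balance_weights(f, bump_p, bump_n):
    """Pick alpha, beta > 0 with sum((1 + alpha*bump_p + beta*bump_n) * f) == 0.

    f, bump_p, bump_n are equal-length lists of sample values; sums stand in
    for the integrals of the text (this discretization is an assumption).
    bump_p is supported where f > 0, bump_n where f < 0.
    """
    c = sum(f)                                   # analogue of the first integral
    a = sum(bp * v for bp, v in zip(bump_p, f))  # second integral, must be positive
    b = sum(bn * v for bn, v in zip(bump_n, f))  # third integral, must be negative
    if a <= 0 or b >= 0:
        raise ValueError("bump_p must sit where f > 0 and bump_n where f < 0")
    if c >= 0:
        # Fix alpha and solve c + alpha*a + beta*b = 0 for beta (> 0 since b < 0).
        alpha = 1.0
        beta = (c + a) / (-b)
    else:
        # Fix beta and solve for alpha (> 0 since a > 0).
        beta = 1.0
        alpha = (-c - b) / a
    return alpha, beta


f = [3, 1, -1, -2]
alpha, beta = balance_weights(f, [1, 0, 0, 0], [0, 0, 0, 1])
weights = [1 + alpha * p + beta * n for p, n in zip([1, 0, 0, 0], [0, 0, 0, 1])]
```

Fixing one constant at 1 and solving for the other is just one convenient way to land on a positive pair; the text only requires that some positive pair exists.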

8. Linear integral inequalities. As an application of the theorem proved
in the foregoing sections we consider the following problem.

Given the linear integral inequality


(12) $\phi(x) + \int_a^b \kappa(x, s)\,\phi(s)\,ds > 0$


in which the kernel $\kappa(x, s)$ is continuous on the square

$a \leq x \leq b, \quad a \leq s \leq b.$


Under what conditions will the inequality admit a solution $\phi(x)$ continuous
on the interval $\mathfrak{X}$?
We note first that (12) is equivalent to an integral equation


(13) $\phi(x) + \int_a^b \kappa(x, s)\,\phi(s)\,ds = \pi(x),$


where $\pi(x)$ is positive and continuous, but otherwise subject to
determination.

If the Fredholm determinant D of the kernel $\kappa(x, s)$ is different from
zero, (13) possesses for any continuous $\pi(x)$ a continuous solution


(14) $\phi(x) = \pi(x) - \frac{1}{D} \int_a^b D(x, s)\,\pi(s)\,ds,$


where D(x, s) is the first minor of D. Hence we have the result


If the Fredholm determinant D of the kernel $\kappa(x, s)$ is different from zero,
the general solution of the inequality (12) is given by the formula (14) in which
$\pi(x)$ is positive and continuous but otherwise arbitrary.


In case D = 0, the equation (13) for a given $\pi(x)$ in general admits no
solution. A necessary and sufficient condition for the existence of a solution
is that $\pi(x)$ be orthogonal to every solution of the associated homogeneous
equation


(15) $\psi(x) + \int_a^b \psi(s)\,\kappa(s, x)\,ds = 0,$


or what is equivalent, that $\pi(x)$ be orthogonal to every one of a fundamental
set of solutions of (15). Suppose such a set is


$\{\psi_i\}$: $\psi_1(x), \psi_2(x), \dots, \psi_m(x).$


Then the inequality (12) will have a solution if and only if the set of functions
$\{\psi_i\}$ admits a positive function $\pi(x)$ orthogonal to all of them.

By our Theorem I, a necessary and sufficient condition for the existence
of such a function $\pi(x)$ is that every linear combination of the functions of
the set shall change sign. But the linear combinations of these functions
constitute the non-trivial* solutions of the equation (15). Hence we have


THEOREM II.† A necessary and sufficient condition that the integral
inequality (12) admit a solution $\phi(x)$ is that every non-trivial solution $\psi(x)$
of the associated integral equation (15) shall change sign.


* By a non-trivial solution we mean a solution which is not identically zero.
† The case in which D ≠ 0 is compatible with the theorem, since in that case the equation (15)
admits no non-trivial solution, and the inequality (12) admits a solution (14).


UNIVERSITY OF SASKATCHEWAN, 
SASKATOON, CANADA 


A THEOREM ON ORTHOGONAL SEQUENCES* 


BY 
L. L. DINES 


1. Introduction. In this paper we shall be dealing with infinite sequences
of real numbers, and we shall use the functional form of notation. Thus,
such a sequence may be considered as a real-valued function $\sigma(i)$ of a variable
$i$, the range of $i$ being the class of positive integers.

A sequence will be said to be zero if all its terms are zero, positive if all
its terms are positive, negative if all its terms are negative, M-definite† if
it has some terms of one sign and no terms of the opposite sign, completely
signed if it contains positive terms and negative terms.
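
The five terms just defined can be made concrete on finite lists of numbers. This is a minimal sketch, assuming a finite list stands in for a sequence; the function name `classify` is hypothetical and not part of the paper.

```python
def classify(terms):
    """Return the set of the paper's labels that apply to a finite list of reals."""
    has_pos = any(t > 0 for t in terms)
    has_neg = any(t < 0 for t in terms)
    labels = set()
    if not has_pos and not has_neg:
        labels.add("zero")               # every term is zero
    if has_pos and has_neg:
        labels.add("completely signed")  # both signs occur
    if has_pos != has_neg:
        labels.add("M-definite")         # one sign occurs, the opposite never
    if terms and all(t > 0 for t in terms):
        labels.add("positive")
    if terms and all(t < 0 for t in terms):
        labels.add("negative")
    return labels


examples = {
    "zeros allowed": classify([1, 0, 2]),
    "strictly positive": classify([1, 2]),
    "mixed signs": classify([1, -1, 0]),
}
```

Note that under these definitions every positive sequence is also M-definite, while a sequence that is neither zero nor M-definite is necessarily completely signed.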

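If e is any positive number greater than unity, and $\sigma'(i)$ and $\sigma''(i)$ are
two sequences such that the two series


(1) $\sum_{i=1}^{\infty} |\sigma'(i)|^{e}, \qquad \sum_{i=1}^{\infty} |\sigma''(i)|^{e/(e-1)}$


converge, then the series

(2) $\sum_{i=1}^{\infty} \sigma'(i)\,\sigma''(i)$

converges absolutely.‡ If the sum of the series (2) is zero, the sequences $\sigma'$
and $\sigma''$ will be said to be orthogonal.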

Assuming e to have any fixed value greater than unity, let us denote
by $\mathfrak{S}'$ the class of all sequences $\sigma'$ for which the first series of (1) converges,
and by $\mathfrak{S}''$ the class of all sequences $\sigma''$ for which the second series of (1)
converges.

Each of these classes is closed under the operation of linear combination.
That is, if $\sigma_1(i), \sigma_2(i), \dots, \sigma_m(i)$ are sequences of one of these classes, then
the sequence $\sigma(i)$ defined by


$\sigma(i) = \sum_{j=1}^{m} c_j\,\sigma_j(i),$


where the $c_j$ are real constants, is a sequence of the same class.


* Presented to the Society, September 8, 1927; received by the editors August 18, 1927. 

† An M-definite sequence is an instance of the more general M-definite function defined in an
earlier paper, On sets of functions of a general variable, these Transactions, vol. 29, p. 463.

‡ Cf. F. Riesz, Les Systèmes d'Équations Linéaires à une Infinité d'Inconnues, p. 45.




The purpose of the present paper is to prove the 


THEOREM.* A necessary and sufficient condition that there exist in $\mathfrak{S}''$
a positive sequence $\sigma''(i)$ orthogonal to each of a given set of sequences $\sigma'_1(i)$,
$\sigma'_2(i), \dots, \sigma'_m(i)$ in $\mathfrak{S}'$ is that no linear combination of the given set shall
be M-definite.


2. An outline of the proof. The necessity of the condition is almost
obvious. For if $\sigma''(i)$ is orthogonal to each of the given sequences, then it is
orthogonal to every linear combination of them; that is


(3) $\sum_{i=1}^{\infty} \sigma'(i)\,\sigma''(i) = 0,$


where $\sigma'(i)$ is any such combination. And the equality (3) is manifestly
impossible if $\sigma'(i)$ is M-definite and $\sigma''(i)$ is positive.

To prove the sufficiency of the condition we proceed as follows. By a 
certain well defined process, the given set of m sequences is replaced by 
another set of m sequences, called a reduced set. The essential features 
of the reduction are the following: 

(a) One sequence of the reduced set is zero. 

(b) If the reduced set admits a positive sequence orthogonal to each of 
its members, then the same is true of the original set. 

The reduction process may be repeated, the property (b) persisting, and 
the property (a) introducing a new zero sequence at each repetition; so that 
after m reductions the resulting set contains only sequences which are zero. 
Since any positive sequence is orthogonal to these zero sequences, it follows 
from (b) that the given set admits a positive orthogonal sequence. A 
complication presents itself however in the fact that the sequences of the 
“reduced sets” are multiple sequences, of a type which we now proceed to 
describe. 

3. Classes of multiple sequences. A k-tuple sequence may be thought
of as a function of k variables, each of which ranges independently over the
positive integers, or as a function of a single k-partite variable
$q = (i_1, i_2, \dots, i_k)$ which varies over a composite range $\mathfrak{Q}$ consisting of all
k-tuples of positive integers.


* An analogous theorem relative to continuous functions of a real variable has been proved in an
earlier paper, A theorem on orthogonal functions with an application to integral inequalities, these
Transactions, vol. 30, p. 425. The two theorems justify a certain generalizing postulate which has
been used in a theory of linear inequalities in general analysis. Cf. Bulletin of the American
Mathematical Society, vol. 33, p. 698.




1928] ORTHOGONAL SEQUENCES 441 


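Any class $\mathfrak{S}$ of simple sequences gives rise to a class of k-tuple sequences
which we denote by $\mathfrak{S}_{*}^{k}$, and define as follows: The class $\mathfrak{S}_{*}^{k}$ consists of
all k-tuple sequences $\sigma(i_1, i_2, \dots, i_k)$ for which there exist sequences
$\sigma_1(i_1), \sigma_2(i_2), \dots, \sigma_k(i_k)$ of $\mathfrak{S}$ such that


$|\sigma(i_1, i_2, \dots, i_k)| \leq \sigma_1(i_1)\,\sigma_2(i_2) \cdots \sigma_k(i_k).$


Thus the classes $\mathfrak{S}'$ and $\mathfrak{S}''$ defined in §1 give rise to classes of multiple
sequences ${\mathfrak{S}'}_{*}^{k}$ and ${\mathfrak{S}''}_{*}^{k}$. Relative to these two classes we note the following
properties:

(i) Each of the classes ${\mathfrak{S}'}_{*}^{k}$ and ${\mathfrak{S}''}_{*}^{k}$ is closed under the process of
linear combination.

(ii) If $\sigma'(i_1, i_2, \dots, i_k)$ and $\sigma''(i_1, i_2, \dots, i_k)$ are sequences of ${\mathfrak{S}'}_{*}^{k}$
and ${\mathfrak{S}''}_{*}^{k}$ respectively, then the multiple series


(4) $\sum_{i_1, i_2, \dots, i_k} \sigma'(i_1, i_2, \dots, i_k)\,\sigma''(i_1, i_2, \dots, i_k)$


converges absolutely.
(iii) If $\sigma'(i_1, i_2, \dots, i_k)$ belongs to ${\mathfrak{S}'}_{*}^{k}$ and $\sigma''(i_1, i_2, \dots, i_l)$ belongs
to ${\mathfrak{S}''}_{*}^{l}$ where $l = k + h$, then the series


$\sum_{i_1, i_2, \dots, i_k} \sigma'(i_1, i_2, \dots, i_k)\,\sigma''(i_1, i_2, \dots, i_l)$


converges absolutely for every $(i_{k+1}, i_{k+2}, \dots, i_l)$, and the resulting sum,
as a function of $(i_{k+1}, i_{k+2}, \dots, i_l)$, is a sequence of the class ${\mathfrak{S}''}_{*}^{h}$.

Definition. If the sum of the series (4) is zero, then the k-tuple sequences
$\sigma'$ and $\sigma''$ will be said to be orthogonal.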

4. Reduction. Consider any k-tuple* sequence $\rho(q) = \rho(i_1, i_2, \dots, i_k)$
which is completely signed (that is, which contains at least one positive
and one negative term), the symbol $\rho$ being suggestive of the special rôle
of reducing sequence.

Relative to the sequence $\rho(q)$, the range $\mathfrak{Q}$ of the variable $q$ can be divided
into three well defined sub-classes:


$\mathfrak{Q}_P^{(\rho)}$ = [all $q$ such that $\rho(q) > 0$],
$\mathfrak{Q}_Z^{(\rho)}$ = [all $q$ such that $\rho(q) = 0$],
$\mathfrak{Q}_N^{(\rho)}$ = [all $q$ such that $\rho(q) < 0$].


* A simple sequence if k = 1.


The three classes are mutually exclusive, and are complementary, that is,
$\mathfrak{Q} = \mathfrak{Q}_P^{(\rho)} + \mathfrak{Q}_Z^{(\rho)} + \mathfrak{Q}_N^{(\rho)}$. The elements of the respective classes will be
denoted by the appropriate small letters:


$\mathfrak{Q}_P^{(\rho)} = [p], \quad \mathfrak{Q}_Z^{(\rho)} = [z], \quad \mathfrak{Q}_N^{(\rho)} = [n].$


Corresponding to the division of the range $\mathfrak{Q}$, any k-tuple sequence
$\sigma(q)$ can be divided into three well defined sections: $\sigma(p)$, $\sigma(z)$, $\sigma(n)$; and it
will be convenient to use the notation


$\sigma(q) = [\sigma(p), \sigma(z), \sigma(n)]$


to bring the sections into evidence. The reducing sequence, which for
simplicity is omitted from the notation, will always be known from the
context or by explicit statement.

In the sequel it will sometimes be desirable to replace one or more
of the sections of a sequence by the corresponding sections of the identically
zero sequence $\omega(q)$:


$\omega(q) = [\omega(p), \omega(z), \omega(n)] \equiv 0.$


Of particular importance are the reduced sequences of the following two special
types:


(5) $\sigma_{PZ}(q) = [\sigma(p), \sigma(z), \omega(n)], \qquad \sigma_{N}(q) = [\omega(p), \omega(z), \sigma(n)].$


We note the obvious property that if $\sigma(q)$ belongs to the class ${\mathfrak{S}'}_{*}^{k}$, then each
of the reduced sequences (5) belongs to this class.

5. The reduced outer product. The reducing k-tuple sequence $\rho(q)$
determines with any second k-tuple sequence $\sigma(q)$ a certain 2k-tuple sequence
called their reduced outer product, denoted by $((\rho\sigma))$, and defined as follows:


(6) $((\rho\sigma)) = \rho_{PZ}(q_1)\,\sigma_{N}(q_2) - \sigma_{PZ}(q_1)\,\rho_{N}(q_2),$


where the variables $q_1$ and $q_2$ vary independently over the range $\mathfrak{Q}$ and the
reductions are made with respect to the first factor $\rho(q)$ as reducing sequence.

This type of combination of sequences is clearly not in general
commutative. It does possess the following noteworthy properties:

(i) If $\sigma$ is zero, or if $\sigma = \rho$, then $((\rho\sigma))$ is zero.

(ii) If $\rho$ and $\sigma$ belong to ${\mathfrak{S}'}_{*}^{k}$, then $((\rho\sigma))$ belongs to ${\mathfrak{S}'}_{*}^{2k}$.
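
The definition (6) and property (i) can be checked on short finite lists. The sketch below assumes k = 1 and treats a finite list as a stand-in for a sequence; the helper names `sections` and `reduced_outer_product` are hypothetical and chosen only for this illustration.

```python
def sections(rho, sigma):
    """Split sigma into (sigma_PZ, sigma_N) relative to the reducing sequence rho."""
    sigma_pz = [s if r >= 0 else 0 for r, s in zip(rho, sigma)]
    sigma_n = [s if r < 0 else 0 for r, s in zip(rho, sigma)]
    return sigma_pz, sigma_n


def reduced_outer_product(rho, sigma):
    """Return ((rho sigma)) as a nested list indexed by [q1][q2], following (6)."""
    rho_pz, rho_n = sections(rho, rho)
    sig_pz, sig_n = sections(rho, sigma)
    size = len(rho)
    return [
        [rho_pz[a] * sig_n[b] - sig_pz[a] * rho_n[b] for b in range(size)]
        for a in range(size)
    ]


rho = [2, 0, -1]      # completely signed reducing sequence
sigma = [1, 3, 4]
product = reduced_outer_product(rho, sigma)
```

The only entries that can be non-zero are those whose first index lies in the P or Z section and whose second index lies in the N section, which is why the result of reducing a sequence by itself vanishes.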

6. Reduction of a set of sequences. Suppose we have a set of m k-tuple
sequences


(7) $\sigma'_1(q), \sigma'_2(q), \dots, \sigma'_m(q),$


belonging to ${\mathfrak{S}'}_{*}^{k}$. Then relative to any one of them which is completely
signed, say $\sigma'_r(q)$, we may form a reduced set, viz. a set of m 2k-tuple sequences


(8) $((\sigma'_r \sigma'_1)), ((\sigma'_r \sigma'_2)), \dots, ((\sigma'_r \sigma'_m)),$


each of which is the reduced outer product of the corresponding sequence
of the given set by $\sigma'_r(q)$.

We shall make use of the following three properties of this reduction
process:


PROPERTY (a). The rth sequence of the reduced set is zero.


PROPERTY (b). If there is in ${\mathfrak{S}''}_{*}^{2k}$ a positive 2k-tuple sequence which is
orthogonal to each sequence of the reduced set (8), then there is in ${\mathfrak{S}''}_{*}^{k}$ a positive
k-tuple sequence which is orthogonal to each sequence of the set (7).


PROPERTY (c). If the set (7) admits no M-definite linear combination, then
the same is true of the reduced set (8).


Property (a) is an immediate consequence of §5 (i).
To prove property (b), we have by hypothesis a positive 2k-tuple sequence
$\sigma''(q_1, q_2)$ such that for $j = 1, 2, \dots, m$,


$\sum_{q_1, q_2} ((\sigma'_r \sigma'_j))\,\sigma''(q_1, q_2) = 0.$


Since the series on the left converges absolutely, we may write this in
the form


(9) $\sum_{q_2} \sigma'_{jN}(q_2) \Big[ \sum_{q_1} \sigma'_{rPZ}(q_1)\,\sigma''(q_1, q_2) \Big] - \sum_{q_1} \sigma'_{jPZ}(q_1) \Big[ \sum_{q_2} \sigma'_{rN}(q_2)\,\sigma''(q_1, q_2) \Big] = 0.$


From §3 (iii), it follows that each of the expressions in square brackets is a
sequence of ${\mathfrak{S}''}_{*}^{k}$, and it is clear that the first of these sequences is positive
and the second negative. If we denote them for the moment by $\pi''(q)$ and
$\nu''(q)$ respectively, we may write (9) in the form


(10) $\sum_{q} \sigma'_{jN}(q)\,\pi''(q) - \sum_{q} \sigma'_{jPZ}(q)\,\nu''(q) = 0.$

Now, defining the k-tuple sequence $\sigma''(q)$ by the equality


$\sigma''(q) = \pi''_{N}(q) - \nu''_{PZ}(q),$


the reductions being relative to the reducing sequence $\sigma'_r(q)$, we replace (10)
by the equivalent equation


(11) $\sum_{q} \sigma'_j(q)\,\sigma''(q) = 0.$


The sequence $\sigma''(q)$ is a positive sequence of ${\mathfrak{S}''}_{*}^{k}$, and since the relation (11)
holds for $j = 1, 2, \dots, m$, we have established property (b).
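
The construction in this proof can be traced on finite data. The sketch below assumes k = 1 and finite lists in place of sequences; `tau` stands for the positive double sequence of the hypothesis, and the names `sections`, `pull_back` and `double_sum` are hypothetical. It checks the underlying identity that the double sum against the reduced outer product equals the single sum against the constructed sequence, which holds whether or not the common value is zero.

```python
def sections(rho, sigma):
    """Split sigma into (sigma_PZ, sigma_N) relative to the reducing sequence rho."""
    sigma_pz = [s if r >= 0 else 0 for r, s in zip(rho, sigma)]
    sigma_n = [s if r < 0 else 0 for r, s in zip(rho, sigma)]
    return sigma_pz, sigma_n


def pull_back(rho, tau):
    """Build the single sequence pi_N - nu_PZ from a positive double array tau."""
    size = len(rho)
    rho_pz, rho_n = sections(rho, rho)
    # First bracket: sum over q1 of rho_PZ(q1) * tau(q1, q2); positive in q2.
    pi = [sum(rho_pz[a] * tau[a][b] for a in range(size)) for b in range(size)]
    # Second bracket: sum over q2 of rho_N(q2) * tau(q1, q2); negative in q1.
    nu = [sum(rho_n[b] * tau[a][b] for b in range(size)) for a in range(size)]
    # Keep pi on the N section and -nu on the P and Z sections.
    return [pi[q] if rho[q] < 0 else -nu[q] for q in range(size)]


def double_sum(rho, sigma, tau):
    """Sum of ((rho sigma))(q1, q2) * tau(q1, q2) over all index pairs."""
    rho_pz, rho_n = sections(rho, rho)
    sig_pz, sig_n = sections(rho, sigma)
    size = len(rho)
    return sum(
        (rho_pz[a] * sig_n[b] - sig_pz[a] * rho_n[b]) * tau[a][b]
        for a in range(size)
        for b in range(size)
    )


rho = [2, 0, -1]
sigma = [1, 3, 4]
tau = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
single = pull_back(rho, tau)
```

Because the two sums agree for every `sigma`, orthogonality of `tau` to the reduced set carries over to orthogonality of the constructed sequence to the original set, which is the content of property (b).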

To prove property (c) indirectly, let us suppose there exist constants
$c_1, c_2, \dots, c_m$, such that*


(12) $\sum_{j=1}^{m} c_j\,((\sigma'_r \sigma'_j)) \geq' 0.$

Since $((\sigma'_r \sigma'_r))$ is zero, the coefficient $c_r$ is arbitrary, and the term
corresponding to $j = r$ may be omitted from the summation. Denoting this omission by
an apostrophe over the summation sign, and substituting their definitional
values for $((\sigma'_r \sigma'_j))$, we write (12) in the form


${\sum_{j=1}^{m}}' c_j \{ \sigma'_{rPZ}(q_1)\,\sigma'_{jN}(q_2) - \sigma'_{jPZ}(q_1)\,\sigma'_{rN}(q_2) \} \geq' 0.$


And to bring the sections of the reduced sequences into evidence we write
in the more extended form


(13) ${\sum_{j=1}^{m}}' c_j \{ [\sigma'_r(p_1), \sigma'_r(z_1), \omega(n_1)]\,[\omega(p_2), \omega(z_2), \sigma'_j(n_2)] - [\sigma'_j(p_1), \sigma'_j(z_1), \omega(n_1)]\,[\omega(p_2), \omega(z_2), \sigma'_r(n_2)] \} \geq' 0.$


Recalling that the ranges of the variables $p_1, z_1, n_1, p_2, z_2, n_2$ do not overlap,
and that the sections $\sigma'_r(z)$, $\omega(p)$, $\omega(z)$, $\omega(n)$ are identically zero, we may
replace (13) by two simpler simultaneous inequalities


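(14) ${\sum_{j=1}^{m}}' c_j \{ \sigma'_r(p_1)\,\sigma'_j(n_2) - \sigma'_j(p_1)\,\sigma'_r(n_2) \} \geq 0,$


(15) $-{\sum_{j=1}^{m}}' c_j\,\sigma'_j(z_1)\,\sigma'_r(n_2) \geq 0,$

the symbol $\geq$ having the significance of $\geq'$ in at least one of them.
Now since by definition the product $-\sigma'_r(p_1)\,\sigma'_r(n_2)$ is positive for every
$(p_1, n_2)$, we may obtain from (14) an equivalent inequality


(16) ${\sum_{j=1}^{m}}' c_j \frac{\sigma'_j(p_1)}{\sigma'_r(p_1)} \geq {\sum_{j=1}^{m}}' c_j \frac{\sigma'_j(n_2)}{\sigma'_r(n_2)}.$


* The symbol $\geq'$ is to be read “is somewhere greater than and nowhere less than.” Thus (12)
is equivalent to the statement that the left side is M-definite.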




The values on the left side of (16) have (for varying $p_1$) a greatest lower
bound, and those on the right (for varying $n_2$) a least upper bound. These
bounds may or may not coincide, but in any case we may choose the arbitrary
$c_r$ so that


${\sum_{j=1}^{m}}' c_j \frac{\sigma'_j(p_1)}{\sigma'_r(p_1)} \geq -c_r \geq {\sum_{j=1}^{m}}' c_j \frac{\sigma'_j(n_2)}{\sigma'_r(n_2)}.$


From this double relation we obtain, since $\sigma'_r(p) > 0$ and $\sigma'_r(n) < 0$,


(17) $\sum_{j=1}^{m} c_j\,\sigma'_j(p_1) \geq 0, \qquad \sum_{j=1}^{m} c_j\,\sigma'_j(n_2) \geq 0.$


Furthermore, from (15) we obtain


(18) $\sum_{j=1}^{m} c_j\,\sigma'_j(z_1) \geq 0.$


Since the ranges of $p_1, z_1, n_2$, when combined adjunctively, form the range
$\mathfrak{Q}$, we may combine (17) and (18) in the single inequality


$\sum_{j=1}^{m} c_j\,\sigma'_j(q) \geq' 0.$

This result contradicts the hypothesis of property (c), and the contradiction
proves the desired conclusion.
7. Completion of proof of the theorem. We return now to the principal
theorem of §1. By hypothesis we are given a set of sequences


$\sigma'_1(i), \sigma'_2(i), \dots, \sigma'_m(i)$


of $\mathfrak{S}'$, of which no linear combination is M-definite. We are to prove that
there is a positive sequence of $\mathfrak{S}''$ which is orthogonal to each sequence of
the set.

First of all we may assume that not all the sequences of the set are zero,
since in that case the theorem is obviously true. Also, the sequences which
are not zero are completely signed, since they cannot be M-definite.

Suppose the first sequence $\sigma'_1$ is not zero. We form the reduced set with
respect to $\sigma'_1$ as reducing function, and denote it by


$0, {\sigma'_2}^{(1)}, {\sigma'_3}^{(1)}, \dots, {\sigma'_m}^{(1)}.$


It consists of m double sequences of ${\mathfrak{S}'}_{*}^{2}$, of which the first is zero. Other
sequences of the set may be zero. If they are all zero, they all admit a
positive orthogonal sequence, from which follows our theorem, by property
(b). In any case, the non-zero sequences are completely signed, by property
(c). We choose the first such sequence (suppose for definiteness it is ${\sigma'_2}^{(1)}$)
as a reducing sequence, and form a second reduced set


$0, 0, {\sigma'_3}^{(12)}, \dots, {\sigma'_m}^{(12)}.$


The process is now obvious. We repeat the reduction process, until after
$l$ $(\leq m)$ reductions we obtain a set of h-tuple sequences $(h = 2^l)$, all of which
are zero.

Any positive sequence of ${\mathfrak{S}''}_{*}^{h}$ is orthogonal to all sequences of this
last reduced set, and the existence of such a positive sequence implies, by
repeated application of property (b), the existence of a positive sequence
in $\mathfrak{S}''$ orthogonal to each sequence of the given set. This completes the proof
of the theorem.


UNIVERSITY OF SASKATCHEWAN, 
SASKATOON, CANADA 


